Finding Hamilton cycles in random graphs with few queries


Asaf Ferber, Michael Krivelevich, Benny Sudakov, Pedro Vieira


Abstract 

We introduce a new setting of algorithmic problems in random graphs, studying the minimum
number of queries one needs to ask about the adjacency between pairs of vertices of $\mathcal{G}(n,p)$
in order to typically find a subgraph possessing a given target property. We show that if
$p \ge \frac{\ln n + \ln\ln n + \omega(1)}{n}$, then one can find a Hamilton cycle with high probability after exposing
$(1+o(1))n$ edges. Our result is tight in both $p$ and the number of exposed edges.


1 Introduction 

Random Graphs is one of the most popular areas in modern Combinatorics, with
a tremendous number of applications in different scientific fields such as Networks, Algorithms,
Communication, Physics, Life Sciences and more. Ever since its introduction, the binomial random
graph model has been one of the main objects of study in probabilistic combinatorics. Given a
positive integer $n$ and a real number $p \in [0,1]$, the binomial random graph $\mathcal{G}(n,p)$ is a probability
space whose ground set consists of all labeled graphs on the vertex set $[n]$. We can describe the
probability distribution of $G \sim \mathcal{G}(n,p)$ by saying that each pair of elements of $[n]$ forms an edge in
$G$ independently with probability $p$. For more details about random graphs the reader is referred to
the excellent books of Bollobás [3] and of Janson, Łuczak and Ruciński [11].

Due to the importance and visibility of the subject of Random Graphs, and also due to its 
practical connections and the fact that random discrete spaces are frequently used to model real 
world phenomena, it is only natural to study the algorithmic aspects of random graphs. The reader 
is advised to consult an excellent survey of Frieze and McDiarmid on the subject, providing an
extensive coverage of the variety of problems and results in Algorithmic Theory of Random Graphs. 
In this paper we present an apparently new and interesting setting for algorithmic type questions 
about random graphs. 

Usually, questions considered in random graphs have the following generic form: given some
monotone increasing graph property $\mathcal{P}$ (that is, a property of graphs that cannot be violated by
adding edges) and a function $p = p(n) \in [0,1]$, determine whether a graph $G \sim \mathcal{G}(n,p)$ satisfies $\mathcal{P}$
with high probability (whp), that is, with probability tending to 1 as $n$ tends to infinity. To
solve questions of this type, one should show that after asking, for all possible pairs $(i,j)$ of distinct
elements of $[n]$, the question “is $(i,j) \in E(G)$?” and getting a positive answer with probability $p(n)$
independently, whp the graph $G$ obtained from all positive answers possesses $\mathcal{P}$. Here we propose a
different task. Given $p$ such that a graph $G \sim \mathcal{G}(n,p)$ whp satisfies $\mathcal{P}$, we want to devise an algorithm,
probably an adaptive one, that typically asks as few queries “is $(i,j) \in E(G)$?” as possible, and yet
the positive answers reveal a graph which possesses $\mathcal{P}$. We refer to such an algorithm as an
adaptive algorithm interacting with the probability space $\mathcal{G}(n,p)$. For example, consider the case
where $\mathcal{P}$ is the property “containing a Hamilton cycle” (i.e. a cycle passing through all the vertices
of the graph). In this case we aim to find an algorithm that adaptively queries as few pairs as
possible, yet enough to get whp a Hamilton cycle among the positive answers. It is
important to remark that we are not concerned here with the computation time required for the
algorithm to locate a target structure (thus essentially assuming that the algorithm has unbounded
computational power), but we make the algorithm pay for the number of queries it asks, or for the
amount of communication with the random oracle generating the random graph. In this
sense, the setting is reminiscent of branches of Computer Science such as Communication Complexity
and Property Testing.

In general, given a monotone property $\mathcal{P}$, what can we expect? If all $n$-vertex graphs belonging
to $\mathcal{P}$ have at least $m$ edges, then the algorithm should get at least $m$ positive answers to hit the
target property with the required absolute certainty. This means that the obvious lower bound in
this case is $(1+o(1))m/p$ queries, and therefore, whp one would have $(1+o(1))m$ positive answers.
Continuing with our example of Hamiltonicity, this lower bound translates to $(1+o(1))n$ positive
answers. This should serve as a natural benchmark for algorithms of such type. Of course, the above
described framework is very general and can be fed with any monotone property $\mathcal{P}$, thus producing
a variety of interesting questions.

Here is a very simple illustration of our model. Let us choose the target property to be connectedness
(i.e., the existence of a spanning tree) in $G \sim \mathcal{G}(n,p)$. Suppose the edge probability $p(n)$ is
chosen to be above the threshold for connectedness, which is known to be equal to $p(n) = \frac{\ln n}{n}$.
In this case the minimum number of positive answers to the algorithm's queries is obviously $n-1$.
An adaptive algorithm discovering a spanning tree after $n-1$ positive answers is very simple: start
with $T = v$, where $v \in [n]$ is an arbitrary vertex, and at each step query, in an arbitrary order,
previously non-queried pairs between the current tree $T$ and the vertices outside of $T$ until the first
such edge $(u,w) \in G$ has been found, then update $T$ by appending the edge $(u,w)$. Assuming the
input graph $G$ is connected, the algorithm clearly creates a spanning tree of $G$ after exactly $n-1$
positive answers.
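The spanning tree procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_spanning_tree` and the oracle argument `query(u, w)` are hypothetical, vertices are numbered from 0, and the pairs are scanned in increasing order (the text allows any order).

```python
def find_spanning_tree(n, query):
    """Grow a tree from vertex 0, asking only about pairs not queried before.

    `query(u, w)` is a hypothetical oracle returning True when uw is an edge.
    Returns (tree_edges, number_of_queries), or (None, number_of_queries) if
    the revealed graph turns out to be disconnected.
    """
    in_tree = {0}
    tree_edges = []
    asked = set()
    queries = 0
    while len(in_tree) < n:
        found = None
        for u in sorted(in_tree):
            for w in range(n):
                pair = (min(u, w), max(u, w))
                if w in in_tree or pair in asked:
                    continue
                asked.add(pair)          # each pair is queried at most once
                queries += 1
                if query(u, w):
                    found = (u, w)       # first positive answer extends the tree
                    break
            if found is not None:
                break
        if found is None:
            return None, queries         # no edge leaves the tree: disconnected
        tree_edges.append(found)
        in_tree.add(found[1])
    return tree_edges, queries
```

Every positive answer adds one tree edge, so a successful run always ends after exactly $n-1$ positive answers, whatever the number of negative ones.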

In this paper we focus on the property of Hamiltonicity, which is one of the most central notions
in graph theory, and has been intensively studied by numerous researchers. The earlier results on
Hamiltonicity of random graphs were proved by Korshunov and by Pósa [16] in 1976. Building
on these ideas, Bollobás [4], and Komlós and Szemerédi [12] independently showed that for
$p \ge \frac{\ln n + \ln\ln n + \omega(1)}{n}$, a graph $G \sim \mathcal{G}(n,p)$ is whp Hamiltonian. This range of $p$ cannot be further improved
since if $p \le \frac{\ln n + \ln\ln n - \omega(1)}{n}$, then whp a graph $G \sim \mathcal{G}(n,p)$ has a vertex of degree at most 1, and such
a graph is trivially non-Hamiltonian.

In the following theorem, which is the main result of this paper, we verify what the reader may
have suspected: $(1+o(1))n$ positive answers (and thus $(1+o(1))n/p$ queries) are enough to create
a graph which contains a Hamilton cycle, for every $p \ge \frac{\ln n + \ln\ln n + \omega(1)}{n}$.

Theorem 1. Let $p = p(n) \ge \frac{\ln n + \ln\ln n + \omega(1)}{n}$. Then there exists an adaptive algorithm, interacting
with the probability space $\mathcal{G}(n,p)$, which whp finds a Hamilton cycle after getting $(1+o(1))n$ positive
answers.


Note that Theorem 1 is asymptotically optimal in both the edge probability $p$ and the number
of positive answers we get. We remark that if we allow (say) $3n$ positive answers, and replace
$p(n) \ge \frac{\ln n + \ln\ln n + \omega(1)}{n}$ with $p(n) \ge (1+\varepsilon)\ln n/n$, then the result follows quite easily from a result of
Bohman and Frieze [2] by effectively embedding some other random space, like a 3-out, in $\mathcal{G}(n,p)$ and
accessing it in few queries. Moreover, if our goal is to find a Hamilton cycle after having $(1+o(1))n$
positive answers but we are willing to assume the stronger condition $p = \omega(\log n/n)$, then in fact the
problem becomes much easier. In what follows, as an illustrative example, we provide a sketch of a
proof for this statement.

Proposition 1.1. Let $p = \omega(\log n/n)$. Then there exists an adaptive algorithm, interacting with the
probability space $\mathcal{G}(n,p)$, which whp finds a Hamilton cycle after getting $(1+o(1))n$ positive answers.

Proof (sketch). Let $p = f(n)\log n/n$, where $f := f(n)$ is an arbitrary function tending to infinity
with $n$, and set $q_1 = (1-\varepsilon)p$ (where, say, $\varepsilon = f^{-1/3}$) and $q_2$ to be the unique solution to
$1 - p = (1-q_1)(1-q_2)^4$. Our proof consists of two phases, where in Phase 1 we find a “long” path, and in
Phase 2 we close it into a Hamilton cycle.

Phase 1 In this phase we construct a path $P$ of length $t := n - m$, where $m = n/\sqrt{f}$, while
exposing edges in an “online fashion” with edge probability $q_1$, after exposing (successfully) exactly
$t$ edges. At each time step $0 \le \ell \le t-1$ of this phase we try to extend the current path $P_\ell := v_0 \ldots v_\ell$
by exposing an edge of the form $v_\ell u$, where $u \in [n] \setminus V(P_\ell)$. Note that the probability to fail in step $\ell$
is $(1-q_1)^{n-(\ell+1)} \le e^{-q_1 m} = o(1/n)$, and therefore, by applying the union bound, whp the algorithm
does not terminate unsuccessfully during this phase.
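Phase 1 can be illustrated by the following Python sketch. It is a simplified model, not the authors' implementation: the name `grow_path` is hypothetical, vertices are numbered from 0, the path starts at vertex 0, and each candidate pair is exposed by an independent coin of bias `q1` drawn from the supplied generator.

```python
import random

def grow_path(n, t, q1, rng):
    """Try to build a path with t edges, exposing each candidate pair once.

    At every step the pairs between the current endpoint and the vertices
    outside the path are exposed one by one with probability q1 until the
    first success. Returns (path, queries), or (None, queries) if some step
    finds no edge.
    """
    path = [0]
    outside = set(range(1, n))
    queries = 0
    for _ in range(t):
        nxt = None
        for u in sorted(outside):        # pairs (endpoint, u) are all fresh
            queries += 1
            if rng.random() < q1:
                nxt = u
                break
        if nxt is None:
            return None, queries         # the step failed
        path.append(nxt)
        outside.remove(nxt)
    return path, queries
```

Each step contributes exactly one successfully exposed edge, which is why the phase ends after exactly $t$ positive answers when it does not fail.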

Phase 2 In this phase we want to turn the path $P := v_0 \ldots v_t$ obtained in Phase 1 into a Hamilton
cycle. To this end, we define an auxiliary directed graph $D$, based on a subgraph of $G$, and show that
a directed Hamilton cycle in this graph exists whp, and that such a cycle corresponds to a Hamilton
cycle of $G$. Moreover, we show that $D$ contains $O(km) = o(n)$ edges. This will complete the proof.

Let $U := ([n] \setminus V(P)) \cup \{v_P\}$ and set $V(D) = U$. Let us choose (say) $k := 100$ (actually, any
$k \ge 2$ will work, but for a large $k$ the proof of the tool we base our argument on becomes relatively
simple). We choose the arcs (directed edges) of $D$ according to the following procedure:

• For every $v \in U \setminus \{v_P\}$, we define

$\mathrm{Out}(v) = (U - v) \cup \{v_0\}$

and

$\mathrm{In}(v) = (U - v) \cup \{v_t\}$.

Now, in $G$, iteratively expose edges of the form $vu$, $u \in \mathrm{Out}(v)$, with probability $q_2$, independently
at random, according to a random ordering of $\mathrm{Out}(v)$, until you have exactly $k$ successes.
Let $x_1, x_2, \ldots, x_k$ denote these successes, and add $vx_i$, $1 \le i \le k$, as arcs to $E(D)$ (if one of the
$x_i$'s is $v_0$, then add the arc $vv_P$ to $E(D)$). Do the same for edges of the form $uv$, $u \in \mathrm{In}(v)$,
where an arc of the form $v_tv$ translates to the arc $v_Pv$.

• For $v_P$, let $\mathrm{Out}(v_P) = \mathrm{In}(v_P) := U \setminus \{v_P\}$. Expose (in $G$) edges of the form $v_tu$, $u \in \mathrm{Out}(v_P)$,
with probability $q_2$, independently at random, according to a random ordering of $\mathrm{Out}(v_P)$,
until having exactly $k$ successes. For each success $v_tu$, add an arc $v_Pu$ to $E(D)$. Do the same
for $\mathrm{In}(v_P)$, where now one exposes edges of the form $uv_0$, $u \in \mathrm{In}(v_P)$. For each edge of the
form $uv_0$ add the arc $uv_P$ to $E(D)$.

It is relatively easy to show that the probability of not having $k$ successes for at least one of the
vertices is $o(1)$, and that the choices are made independently and uniformly. Moreover, the number
of successfully exposed edges is clearly $O(km) = o(n)$. Therefore, $D$ can be seen as a directed
graph obtained by the following procedure: each vertex picks exactly $k$ in- and $k$ out-neighbors,
independently, uniformly at random. This model is known as $D_{k\text{-in},k\text{-out}}$ and whp contains a
directed Hamilton cycle (see the main result of [6], or the less complicated one in [5]). Clearly,
by replacing the vertex $v_P$ by the path $P$ itself, such a Hamilton cycle corresponds to a Hamilton
cycle of $G$. Moreover, note that throughout the algorithm (both phases), each edge has been queried
at most once with probability $q_1$ (in Phase 1) and at most 4 times with probability $q_2$ (in Phase 2),
and therefore the resulting graph naturally couples as a subgraph of $G \sim \mathcal{G}(n,p)$. It is not hard to see
that if we were to expose with full probability $p$ the edges that have been exposed by this procedure
with any non-zero probability, the number of additional successfully exposed edges would be $o(n)$,
because of our choices of $q_1$ and $q_2$. This completes the proof sketch. □
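The coupling at the end of the sketch relies on the identity $1 - p = (1-q_1)(1-q_2)^4$: one exposure with probability $q_1$ followed by up to four exposures with probability $q_2$ together reveal an edge with probability exactly $p$. A small Python helper makes the choice of $q_2$ concrete; the name `split_probability` and the parameter `rounds` are assumptions made for this illustration.

```python
def split_probability(p, q1, rounds=4):
    """Return q2 such that 1 - p == (1 - q1) * (1 - q2) ** rounds."""
    if not 0 <= q1 <= p < 1:
        raise ValueError("need 0 <= q1 <= p < 1")
    # Solve (1 - q2) ** rounds = (1 - p) / (1 - q1) for q2.
    return 1.0 - ((1.0 - p) / (1.0 - q1)) ** (1.0 / rounds)
```

For $q_1 = (1-\varepsilon)p$ with small $p$, the returned value is roughly $\varepsilon p/4$, which is why the exposures with probability $q_2$ contribute only a lower-order number of edges.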

2 Notation 

Most of the notation used in this paper is fairly standard. Given a natural number $k$ and a set
$S$, we use $[k]$ to denote the set $\{1, 2, \ldots, k\}$ and $\binom{S}{k}$ to denote the collection of subsets of $S$ of
size $k$.

Given a graph G we use V(G) to denote the set of vertices of G. Moreover, given a subset E of 
the edges of G we shall oftentimes abuse notation and refer to the subgraph of G formed by these 
edges simply by E (with vertex set V(G) unless stated otherwise). 

Given a subset $S \subseteq V(G)$, $G[S]$ denotes the subgraph of $G$ induced by the vertices in $S$, i.e. the
graph with vertex set $S$ whose edges are the ones of $G$ between vertices in $S$. Furthermore, we use
$e_G(S)$ to denote the number of edges of the graph $G[S]$.

Given a vertex $v \in V(G)$ and a subset $S \subseteq V(G)$, we use $N_G(v)$ to denote the set of neighbours of
$v$ in the graph $G$, $N_G(S) := \bigcup_{v \in S} N_G(v)$ to denote the set of neighbours of vertices in $S$ in the graph $G$
and $N_G(v,S) := N_G(v) \cap S$ to denote the set of neighbours of $v$ in the graph $G$ which lie in the set $S$.
Moreover, $d_G(v) := |N_G(v)|$ denotes the degree of $v$ in the graph $G$, $d_G(v,S) := |N_G(v,S)|$ denotes the
number of neighbours of $v$ in the graph $G$ which lie in the set $S$, $\Delta(G) := \max_{v \in V(G)} d_G(v)$ denotes
the maximum degree of the graph $G$ and finally $\delta(G) := \min_{v \in V(G)} d_G(v)$ denotes the minimum
degree of the graph $G$.

A subgraph $P$ of the graph $G$ is called a path if $V(P) = \{v_1, \ldots, v_\ell\}$ and the edges of $P$ are $v_1v_2$,
$v_2v_3$, ..., $v_{\ell-1}v_\ell$. We shall oftentimes refer to $P$ simply by $v_1v_2 \ldots v_\ell$. We say that such a path $P$
has length $\ell - 1$ (number of edges) and size $\ell$ (number of vertices). We say that $P$ is a Hamilton
path (in the graph $G$) if it has size $|V(G)|$. Furthermore, a subgraph $C$ of the graph $G$ is called a
cycle if $V(C) = \{v_1, \ldots, v_\ell\}$ and the edges of $C$ are $v_1v_2$, $v_2v_3$, ..., $v_{\ell-1}v_\ell$ and $v_\ell v_1$. As for paths,
we shall oftentimes refer to $C$ simply by $v_1v_2 \ldots v_\ell$. We say that such a cycle has length $\ell$ (number
of edges) and size $\ell$ (number of vertices). We say that $C$ is a Hamilton cycle (in the graph $G$) if
it has size $|V(G)|$. A trail of length $t$ in $G$ between two vertices $x$ and $y$ is a sequence of vertices
$x = v_0, v_1, \ldots, v_t = y$ such that $\{v_0v_1, v_1v_2, \ldots, v_{t-1}v_t\}$ is a set of $t$ distinct edges of $G$.
3 Auxiliary results

3.1 Probabilistic tools 

We need to employ standard bounds on large deviations of random variables. We mostly use
the following well-known bound on the lower and upper tails of the Binomial distribution due to
Chernoff (see e.g. [1], [11]).

Lemma 3.1. Let $X \sim \mathrm{Bin}(n,p)$ and let $\mu = \mathbb{E}[X]$. Then

• $\Pr[X < (1-a)\mu] < e^{-a^2\mu/2}$ for every $a > 0$;

• $\Pr[X > (1+a)\mu] < e^{-a^2\mu/3}$ for every $0 < a < 3/2$.

The following is a trivial yet useful bound. 

Lemma 3.2. Let $X \sim \mathrm{Bin}(n,p)$ and $k \in \mathbb{N}$. Then the following holds:

$\Pr(X \ge k) \le \left(\frac{enp}{k}\right)^k$.

Proof. $\Pr(X \ge k) \le \binom{n}{k}p^k \le \left(\frac{enp}{k}\right)^k$. □
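As a sanity check, the bound of Lemma 3.2 can be compared numerically with the exact binomial tail. The sketch below uses only the Python standard library; the function names `binomial_upper_tail` and `lemma_bound` are chosen for this illustration.

```python
from math import comb, e

def binomial_upper_tail(n, p, k):
    """Exact value of Pr(X >= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def lemma_bound(n, p, k):
    """The upper bound (e*n*p/k)**k from Lemma 3.2."""
    return (e * n * p / k) ** k
```

The bound is only informative when $k$ is noticeably larger than $enp$; for smaller $k$ it exceeds 1 and holds trivially.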


3.2 Properties of random graphs 

We start with the following natural definition for $k$-pseudorandomness of graphs.

Definition 3.3. A graph $G$ is called $k$-pseudorandom if $e_G(A,B) > 0$ for every two disjoint sets $A$
and $B$ of size at least $k$.
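For small graphs the definition can be tested by brute force. The following Python sketch is purely illustrative (exponential time; the name `is_k_pseudorandom` and the edge-list input format are assumptions). It uses the observation that it suffices to consider sets of size exactly $k$: for a fixed $A$ of size $k$, a bad set $B$ exists precisely when at least $k$ vertices outside $A$ have no neighbour in $A$.

```python
from itertools import combinations

def is_k_pseudorandom(n, edges, k):
    """Check Definition 3.3 on the graph with vertices 0..n-1 (k >= 1)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for A in combinations(range(n), k):
        # Vertices outside A that have at least one neighbour in A.
        reach = set().union(*(adj[a] for a in A)) - set(A)
        non_neighbours = n - len(A) - len(reach)
        if non_neighbours >= k:
            return False      # some B of size k has no edge to A
    return True
```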

Lemma 3.4. Let $k = k(n)$ be an integer such that $3\ln n \le k \le [\ldots]$ and let $\frac{3\ln(n/k)}{k} \le p \le 1$. Then
whp a graph $G \sim \mathcal{G}(n,p)$ is $k$-pseudorandom.


Proof. If $G$ is not $k$-pseudorandom then there exist two disjoint sets $S$ and $T$ with $|S| = |T| = k$ and
no edge between them. Note that the probability that there is no edge between a given such pair
$\{S,T\}$ is $(1-p)^{k^2}$ and there are at most $\binom{n}{k}^2$ such pairs. Thus, applying the union bound over all
pairs of disjoint sets $S, T$ of size $k$ we obtain that the probability that $G$ is not $k$-pseudorandom is
at most

$\binom{n}{k}^2 (1-p)^{k^2} \le \left(\frac{en}{k}\right)^{2k} e^{-pk^2} = \left(\frac{e^2n^2e^{-pk}}{k^2}\right)^k = o(1).$

Thus, we conclude that whp a graph $G \sim \mathcal{G}(n,p)$ is $k$-pseudorandom as claimed. □


In the following two lemmas we state a few properties of a typical random graph which will be 
used extensively throughout the paper. 

Lemma 3.5. Let $p = p(n) \in (0,1)$, let $c > 1$ be a constant and let $C = C(n) \ge 6\ln(np\ln n)$. Then
whp a graph $G \sim \mathcal{G}(n,p)$ is such that the following holds:

(P1) $\Delta(G) \le 4np$, provided $p \ge \frac{\ln n}{n}$.

(P2) $e_G(X) \le c|X|$ for any subset $X \subseteq V(G)$ of size at most $[\ldots]$.

(P3) $e_G(X) \le C|X|$ for any subset $X \subseteq V(G)$ of size at most $\frac{C}{2p}$.


Proof. For (P1), note that for a vertex $v \in V(G)$ we have $d_G(v) \sim \mathrm{Bin}(n-1,p)$ and so by Lemma 3.2

$\Pr[d_G(v) \ge 4np] \le \left(\frac{e(n-1)p}{4np}\right)^{4np} \le \left(\frac{e}{4}\right)^{4\ln n} = o\left(\frac{1}{n}\right).$

Thus, by applying the union bound over all vertices of $G$ we see that the probability that there exists
a vertex $v \in V(G)$ with $d_G(v) \ge 4np$ is $o(1)$, settling (P1).

For (P2), note that for a fixed $X \subseteq V(G)$ of size $|X| = x$ one has $e_G(X) \sim \mathrm{Bin}\left(\binom{x}{2}, p\right)$. Therefore,
by Lemma 3.2 we obtain

$\Pr[e_G(X) \ge cx] \le \left(\frac{exp}{2c}\right)^{cx}.$

Applying the union bound over all subsets of $V(G)$ of size at most $t = [\ldots]$ we see that
the probability that there exists a set $X$ of size $x \le t$ with $e_G(X) \ge cx$ is upper bounded by

$[\ldots]$

since $[\ldots]$. This settles (P2).

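For (P3) we proceed in a similar way as with (P2). By (P2), taking $c = 2$, we know that whp all
sets $X \subseteq V(G)$ of size at most $\frac{1}{2np^2\ln n}$ satisfy $e_G(X) \le 2|X| \le C|X|$. Thus, we just need to show
that the probability that there exists a set $X$ of size $\frac{1}{2np^2\ln n} \le x \le \frac{C}{2p}$ with $e_G(X) \ge Cx$ is $o(1)$.
Indeed, proceeding as above, we see that this probability is at most

$\sum_{x = \frac{1}{2np^2\ln n}}^{\frac{C}{2p}} \binom{n}{x}\left(\frac{exp}{2C}\right)^{Cx} \le \sum_{x = \frac{1}{2np^2\ln n}}^{\frac{C}{2p}} \left(\frac{en}{x}\left(\frac{e}{4}\right)^{C}\right)^{x} \le \sum_{x = \frac{1}{2np^2\ln n}}^{\frac{C}{2p}} \left(\frac{2e(np)^2\ln n}{(np\ln n)^2}\right)^{x} = o(1),$

since $C \ge 6\ln(np\ln n)$ and $\left(\frac{e}{4}\right)^6 \le e^{-2}$. This settles (P3). □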

Lemma 3.6. Let $\omega := \omega(n)$ be such that $\omega \to \infty$ as $n \to \infty$, let $p := p(n)$ be such that
$\frac{\ln n + \ln\ln n + \omega}{n} \le p \le \frac{10\ln n}{n}$, let $G \sim \mathcal{G}(n,p)$ and let $C$ be a fixed positive integer. Then whp all of the following hold:

(P1) $G$ has minimum degree at least 2.

(P2) there are no two cycles in $G$ of length at most $\frac{\ln n}{4\ln(np)}$ sharing exactly one vertex.

(P3) between any two vertices $u, v \in V(G)$ there are at most 3 trails of length at most $C$.
Proof. First we prove that whp (P1) holds. Note that for a fixed vertex $v \in V(G)$ we have

$\Pr[d_G(v) < 2] = (1-p)^{n-1} + (n-1)p(1-p)^{n-2} \le 2e^{-np} + 2npe^{-np} \le \frac{2e^{-\omega}}{n\ln n} + 20\ln n \cdot \frac{e^{-\omega}}{n\ln n} = o\left(\frac{1}{n}\right),$

where in the second inequality we used the fact that $\ln n + \ln\ln n + \omega \le np \le 10\ln n$, and in the
last step we used the fact that $\omega \to \infty$. Thus, taking the union bound over all vertices in $V(G)$ we
obtain

$\Pr[\delta(G) < 2] = \Pr[\exists v \in V(G) \text{ such that } d_G(v) < 2] \le n \cdot o\left(\frac{1}{n}\right) = o(1),$

implying that (P1) holds whp as claimed.

Next we show that (P2) also holds whp. Note that if $C_1$ and $C_2$ are cycles in $K_n$ of lengths $\ell_1, \ell_2$,
respectively, sharing exactly one vertex then $|V(C_1) \cup V(C_2)| = \ell_1 + \ell_2 - 1$ and $|E(C_1) \cup E(C_2)| = \ell_1 + \ell_2$.
Thus, using the union bound, we see that the probability that in $G$ we have two cycles of lengths
$\ell_1$ and $\ell_2$ which share exactly one vertex is at most $n^{\ell_1+\ell_2-1}p^{\ell_1+\ell_2} = \frac{(np)^{\ell_1+\ell_2}}{n}$. Moreover, letting
$k := \frac{\ln n}{4\ln(np)}$ and taking the union bound over all pairs of cycles of lengths at most $k$ which share
exactly one vertex, we see that the probability of (P2) not holding is at most

$\sum_{\ell_1,\ell_2=1}^{k} \frac{(np)^{\ell_1+\ell_2}}{n} \le \frac{k^2(np)^{2k}}{n} = o(1),$

since $(np)^{2k} = \sqrt{n}$. Thus, we conclude that (P2) holds whp as claimed.

Finally we show that whp (P3) holds. Let $u, v \in V(G)$ and let $W_1, W_2, W_3, W_4$ be any four
distinct trails between $u$ and $v$ in $K_n$, each of length at most $C$. A moment's thought reveals that

$|V(W_1) \cup V(W_2) \cup V(W_3) \cup V(W_4)| < |E(W_1) \cup E(W_2) \cup E(W_3) \cup E(W_4)| \le 4C.$

Thus, using the union bound we see that the probability of (P3) not holding is at most

$\sum_{\ell=1}^{4C} n^{\ell-1}p^{\ell} = \sum_{\ell=1}^{4C} \frac{(np)^{\ell}}{n} \le \frac{4C(10\ln n)^{4C}}{n} = o(1).$

We conclude that (P3) holds whp, completing the proof of the lemma. □


3.3 Properties of graphs 

The following simple lemma can be found, e.g., in Chapter 1 of [7].

Lemma 3.7. Let $G = (V,E)$ be a graph. There exists $S \subseteq V$ such that $G[S]$ is a connected graph of
minimum degree at least $|E|/|V|$.

The next lemma follows from a simple application of Hall's Theorem (see e.g. the exercises of
Chapter 2 in [7]).

Lemma 3.8. Let $G = (V,E)$ be a bipartite graph with bipartition $V = A \cup B$, and let $k$ be a natural
number. Suppose that for every $I \subseteq A$ we have $|N(I)| \ge k|I|$. Then for every $i \in A$ there exists a
subset $J_i \subseteq N(i)$ of size $|J_i| = k$ such that all the sets $(J_i)_{i \in A}$ are disjoint.
A routine way to turn a non-Hamiltonian graph $H$ that satisfies some expansion properties into
a Hamiltonian graph is by using boosters. A booster is a non-edge $e$ of $H$ such that the addition
of $e$ to $H$ creates a path which is longer than a longest path of $H$, or turns $H$ into a Hamiltonian
graph. In order to turn $H$ into a Hamiltonian graph, we start by adding a booster $e$ of $H$. If the
new graph $H \cup \{e\}$ is not Hamiltonian then one can continue by adding a booster of the new graph.
Note that after at most $|V(H)|$ successive steps the process must terminate and we end up with a
Hamiltonian graph. The main point of using this method is that it is well known (for example, see [3])
that a non-Hamiltonian connected graph $H$ with “good” expansion properties has many boosters.
In the proof of Theorem 1 we use a similar notion of boosters, known as $e$-boosters.

Given a graph $H$ and a pair $e \in \binom{V(H)}{2}$, consider a path $P$ of $H \cup \{e\}$ of maximal length which
contains $e$ as an edge. A non-edge $f$ of $H$ is called an $e$-booster if $H \cup \{e,f\}$ contains a path $Q$ which
passes through $e$ and which is longer than $P$, or if $H \cup \{e,f\}$ contains a Hamilton cycle that uses $e$.
The following lemma appears in [8] and shows that every connected and non-Hamiltonian graph
satisfying certain expansion properties has many $e$-boosters for every possible $e$.

Lemma 3.9. Let $H$ be a connected graph for which $|N_H(X) \setminus X| \ge 2|X| + 2$ holds for every subset
$X \subseteq V(H)$ of size $|X| \le k$. Then, for every pair $e \in \binom{V(H)}{2}$ such that $H \cup \{e\}$ does not contain a
Hamilton cycle which uses the edge $e$, the number of $e$-boosters for $H$ is at least $\frac{1}{2}(k+1)^2$.

4 Proof of Theorem 1

In this section we prove Theorem 1. In our proof we present a randomised algorithm which
successively queries (about adjacency) carefully selected pairs of vertices in $\mathcal{G}(n,p)$, where
$p \ge \frac{\ln n + \ln\ln n + \omega(1)}{n}$. We then show that whp the algorithm terminates by revealing a Hamilton cycle
after only $n + o(n)$ positive answers. The algorithm is divided into five different phases, labeled I,
II, III, IV and V.

We remark that we may and will assume throughout the proof that $p \le \frac{10\ln n}{n}$. Indeed, if $p > \frac{10\ln n}{n}$
then we can use the algorithm Alg($p'$), which queries pairs of vertices with probability $p' = \frac{10\ln n}{n}$, with
a slight modification to obtain an algorithm Alg($p$) which queries pairs of vertices with probability $p$.
When a pair of vertices is queried by Alg($p'$), do it in two stages: first query it with probability $\frac{p'}{p}$ to
decide whether Alg($p$) should query this pair of vertices as well; if the answer is positive, then query
it a second time with probability $p$. A pair of vertices which is queried by Alg($p'$) is considered to be
an edge if and only if the answer to both questions is positive, and so this happens with probability
$p' = \frac{p'}{p} \cdot p$. However, in the algorithm Alg($p$) pairs of vertices are queried about adjacency with
probability $p$. Finally, note crucially that the edges which are revealed by the algorithm Alg($p$) are
exactly the same as the ones which are revealed by the algorithm Alg($p'$) and so, if the latter whp
finds a Hamilton cycle after only $n + o(n)$ positive answers then so does the former.
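The two-stage reduction can be modelled by a short Python sketch. The function name `two_stage_query` is hypothetical, and both the private coin and the oracle answer are simulated with the same pseudo-random generator for simplicity.

```python
import random

def two_stage_query(p, p_prime, rng):
    """Simulate one query of Alg(p') on top of an oracle of density p.

    Returns (oracle_called, is_edge). Stage 1 is a private coin of bias
    p'/p; only if it succeeds is the real pair queried, and that query is
    positive with probability p. Overall the pair is an edge w.p. p'.
    """
    wants_pair = rng.random() < p_prime / p
    if not wants_pair:
        return False, False
    return True, rng.random() < p
```

Only the second stage consumes a query to $\mathcal{G}(n,p)$, so Alg($p$) never receives more positive answers than Alg($p'$) reveals edges.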

In order to simplify notation in the proof, we work in the following setting. Throughout the
algorithm we maintain a tripartition $R \cup W \cup B$ of the edge set of the complete graph with vertex
set $V = [n]$. Edges in $R$, $W$, $B$ are called respectively red, white and blue. A red edge represents an
edge which has been queried successfully (and thus belongs to the exposed graph $G$), a white edge
represents an edge which has not yet been queried and a blue edge represents an edge which has been
queried unsuccessfully. During the algorithm we recolour some white edges. Recolouring a white
edge means that with probability $p$ we recolour it red (i.e., we move it from the set $W$ to the set $R$),
and otherwise we recolour it blue (i.e., we move it from the set $W$ to the set $B$). All the recolourings
are considered independent. At any point during the algorithm, the red graph (respectively, white
graph and blue graph) refers to the graph with vertex set $V$ and edge set $R$ (respectively, $W$ and
$B$). Moreover, if $u, v \in V$ then we say that $v$ is a red neighbour (respectively, white neighbour and
blue neighbour) of $u$ if $uv \in R$ (respectively, $uv \in W$ and $uv \in B$). The algorithm starts with the
tripartition $(R,W,B) = \left(\emptyset, \binom{V}{2}, \emptyset\right)$ and whp it ends with the red graph containing a Hamilton cycle
while having only $n + o(n)$ edges.
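The red/white/blue bookkeeping can be captured by a small data structure. The class below is a minimal sketch under stated assumptions: the name `Colouring` and its attributes are hypothetical, vertices are numbered from 0, and the recolouring coin is drawn from a seeded pseudo-random generator.

```python
import random
from itertools import combinations

class Colouring:
    """Tripartition (R, W, B) of the edges of the complete graph on n vertices."""

    def __init__(self, n, p, seed=0):
        self.p = p
        self.rng = random.Random(seed)
        # Initially every pair is white: (R, W, B) = (empty, all pairs, empty).
        self.white = {frozenset(e) for e in combinations(range(n), 2)}
        self.red = set()
        self.blue = set()

    def recolour(self, u, v):
        """Recolour the white edge uv: red with probability p, blue otherwise."""
        edge = frozenset((u, v))
        if edge not in self.white:
            raise ValueError("only white edges can be recoloured")
        self.white.remove(edge)
        target = self.red if self.rng.random() < self.p else self.blue
        target.add(edge)
        return target is self.red
```

Refusing to recolour a non-white edge enforces that every pair is queried at most once, which is what keeps the recolourings independent.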

During the algorithm, if $R$ denotes the set of red edges at a certain point, and if at that point it
is verified that any of the events below does not hold then we stop the algorithm:

N.1 We have $\Delta(R) \le 40\ln n$.

N.2 If none of the edges incident to a given vertex $v \in V$ are white (i.e. they were already recoloured
before) then $v$ has at least 2 red neighbours.

N.3 There are no two cycles in $R$ of length at most $(\ln n)^{0.9}$ sharing exactly one vertex.

N.4 Between any two vertices $u, v \in V$ there are at most three trails of length at most 6 in $R$.

We remark that all of these events hold whp by (P1) of Lemma 3.5 and by Lemma 3.6 and so we
can assume throughout the proof that these properties always hold.

In Phases I-IV we consider a finite number of properties concerning the tripartition (R,W,B), 
which we need for later phases. These properties will be labeled according to the phase in which 
they are considered, in order to make it easier for the reader to find them in a later reference. For 
example, II.1(b) will be used to denote property 1(b) of Phase II. In each phase, we show that all
the properties considered hold whp and so we may assume that they hold for later phases. 

4.1 Outline of the algorithm 

Phase I: In this phase we use a modified version of the well-known graph search algorithm
Depth First Search. Starting from a complete graph with all edges white, we use this
algorithm to find a “long” red path $\mathcal{P}$ by recolouring red at most $n - 1$ white edges. Afterwards,
by recolouring red one more white edge between “short” initial and final segments of the path $\mathcal{P}$,
we create a red cycle $C_1$ of size $n - \Theta(n(\ln n)^{-0.45})$. This is done whilst ensuring some technical
conditions needed for later phases.

Phase II: Let $U := V \setminus V(C_1)$. In this phase, starting from the partition $V = V(C_1) \cup U$,
we recolour a random subset of white edges in $U$, and partition $U$ into three sets $\mathrm{EXP}_1$, $\mathrm{SMALL}_1$
and $\mathrm{TINY}$. The set $\mathrm{EXP}_1$ will be such that the minimum red degree inside it is $\Omega((\ln n)^{0.4})$ and
$|U \setminus \mathrm{EXP}_1| = o(n/\ln n)$. Later, we recolour all the white edges between $U \setminus \mathrm{EXP}_1$ and $V(C_1)$. A
partition $U \setminus \mathrm{EXP}_1 = \mathrm{SMALL}_1 \cup \mathrm{TINY}$ is then obtained by letting $\mathrm{SMALL}_1$ be the set of all vertices
of “large” red degree into a “large” subset $\mathcal{M}_1$ of $V(C_1)$. Finally, we recolour all the white edges
inside $U$ touching vertices of $\mathrm{TINY}$. All of this is achieved by recolouring red only $o(n)$ edges during
this phase.

Phase III: The goal of this phase is to “swallow” the vertices of $\mathrm{TINY}$ one at a time into the red
cycle $C_1$ obtained in Phase I. This is achieved by creating at each time a larger red cycle that contains
a new vertex of $\mathrm{TINY}$ and some vertices of $\mathrm{EXP}_1$, until a red cycle $C_2$ such that $V(C_1) \cup \mathrm{TINY} \subseteq V(C_2)$
is obtained. We ensure that whp only $o(n)$ edges are recoloured red during this phase. At the end of
this phase we get a partition of the vertex set $V = V(C_2) \cup \mathrm{EXP}_2 \cup \mathrm{SMALL}_2$ where $\mathrm{EXP}_2 \subseteq \mathrm{EXP}_1$
is a “good expander” and $\mathrm{SMALL}_2 \subseteq \mathrm{SMALL}_1$ is a set of vertices of “large” degree into a “large”
subset $\mathcal{M}_2 \subseteq \mathcal{M}_1$ of $V(C_2)$.

Phase IV: The goal of this phase is to “swallow” the vertices of $\mathrm{SMALL}_2$ one at a time into
the red cycle $C_2$ obtained in Phase III. This is achieved by creating at each time a larger red cycle
that contains a new vertex of $\mathrm{SMALL}_2$, until a red cycle $C_3$ such that $V(C_2) \cup \mathrm{SMALL}_2 \subseteq V(C_3)$ is
obtained. We ensure that whp only $o(n)$ edges are recoloured red during this phase. Moreover, at
the end we get a partition of the vertex set $V = V(C_3) \cup \mathrm{EXP}_3$ where $\mathrm{EXP}_3 \subseteq \mathrm{EXP}_2$ is a “good
expander”.

Phase V: In this phase we create a Hamilton cycle in the red graph by merging the red cycle
$C_3$ with the set $\mathrm{EXP}_3$. We start by recolouring red $O(1)$ white edges in $\mathrm{EXP}_3$ in order to make the
red graph in $\mathrm{EXP}_3$ become connected. Afterwards, we consider two adjacent vertices $v, w$ in the red
cycle $C_3$ which have large white degree onto $\mathrm{EXP}_3$. By recolouring edges between $v, w$ and $\mathrm{EXP}_3$
we then find $x, y \in \mathrm{EXP}_3$ such that $vx$ and $wy$ are coloured red. Finally, using the fact that the red
graph in $\mathrm{EXP}_3$ is a connected expander we recolour red at most $|\mathrm{EXP}_3|$ edges in order to find a red
Hamilton path from $x$ to $y$ inside the set $\mathrm{EXP}_3$. This path together with the red path $C_3 \setminus \{vw\}$ and
the red edges $vx$ and $wy$ then provides the desired red Hamilton cycle in $V$. All of this is achieved
by recolouring red only $o(n)$ edges during this phase.

4.2 Phase I 

The algorithm for this phase is divided into two stages. In Stage 1 we use a randomised version
of the Depth First Search exploration algorithm to obtain a “long” red path $\mathcal{P}$. In Stage 2 of the
algorithm we use the red path $\mathcal{P}$ to find a red cycle $C_1$ of size $n - \Theta(n(\ln n)^{-0.45})$, by recolouring red
exactly one white edge between an initial and a final interval of $\mathcal{P}$.

(Stage 1) In this stage we run a (slightly modified version of the) DFS algorithm on the vertex
set $V = [n]$. Recall that DFS is an algorithm to explore all the connected components of a graph
$G = (V,E)$, while finding a spanning tree of each of them in the following way. It maintains
a tripartition $(C, A, U)$ of the vertex set $V$, letting $C$ be the set of vertices whose exploration is
complete, $U$ be the set of unvisited vertices and $A = V \setminus (C \cup U)$ (the vertices which are “active”),
where the vertices of $A$ are kept in a stack (last in, first out). It starts with $C = A = \emptyset$ and $U = V$
and runs until $A \cup U = \emptyset$. In each round of the algorithm, if $A \ne \emptyset$, then it identifies the last vertex
$a \in A$, and starts to query $U$ for neighbours of $a$, according to the natural ordering on them. If
such a neighbour exists, let $u \in U$ be the first such neighbour; the algorithm then moves $u$ from $U$
to $A$. Otherwise, the algorithm moves $a$ from $A$ to $C$. If $A = \emptyset$, then the algorithm moves the first
(according to the natural ordering) vertex in $U$ to $A$.

The following properties of DFS will be relevant for us and follow immediately from its description. 

(O1) At any point during the algorithm, it is true that all the pairs between $C$ and $U$ have been
queried, and none of them are edges of $G$.

(O2) Throughout the algorithm, the explored graph is a forest.

(O3) At each round of the algorithm exactly one vertex moves, either from $U$ to $A$ or from $A$ to $C$.

(O4) The set $A$ always spans a path.
For our purposes, we run DFS on a random input in the following way. At each round of the
algorithm let $e$ be the relevant pair waiting to be queried. We first decide with probability
$q = (\ln n)^{-1/2}$ whether we want to recolour this pair. If yes, we recolour the pair $e$ red with probability
$p$ (and consider it as an edge for DFS) and blue otherwise (and consider it as a non-edge for DFS).
If no, we consider $e$ to be a non-edge for DFS. All these actions happen independently at random.
We stop this algorithm as soon as $|C| = |U|$ (and not when $A \cup U = \emptyset$), with the set $A$ spanning a
red path $\mathcal{P} = a_1a_2 \ldots a_{|A|}$ of size $|A| = n - |C| - |U|$. Claim 4.1 below ensures that whp at the end
of the routine one has $k := n(\ln n)^{-0.45} \ge |C| = |U|$, so that $|A| \ge n - 2k$. We assume henceforth
that this holds and we proceed to Stage 2.

(End of Stage 1) 
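
Stage 1 can be simulated directly. The sketch below is an illustration under stated assumptions rather than the authors' code: the name `stage_one`, the 0-based vertex labels and the use of Python's `random.Random` are choices made for the example, while the two coin flips (first with probability $q$, then with probability $p$) and the stopping rule $|C| = |U|$ follow the description above.

```python
import random

def stage_one(n, p, q, rng):
    """Randomised DFS of Stage 1, stopped as soon as |C| == |U|.

    Returns (A, C, U, red): the stack A (a red path), the sets C and U at the
    stopping time, and the set of pairs that were recoloured red.
    """
    C, A, U = [], [], list(range(n))
    asked, red = set(), set()
    while len(C) != len(U):
        if not A:
            A.append(U.pop(0))           # start exploring a new component
            continue
        a = A[-1]
        for u in U:
            if (a, u) in asked:
                continue
            asked.add((a, u))
            # Recolour with probability q; a recoloured pair is red with probability p.
            if rng.random() < q and rng.random() < p:
                red.add(frozenset((a, u)))
                U.remove(u)
                A.append(u)
                break
            # Otherwise the pair counts as a non-edge for the DFS.
        else:
            C.append(A.pop())
    return A, C, U, red
```

Since exactly one vertex moves per round, $|U| - |C|$ drops by one each time, so the loop always reaches $|C| = |U|$; consecutive vertices of the returned stack are joined by red pairs, in line with (O4).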

(Stage 2) Consider the intervals $I_1 := \{a_1, a_2, \ldots, a_k\}$ and $I_2 := \{a_{n-3k+1}, a_{n-3k+2}, \ldots, a_{n-2k}\}$
of $\mathcal{P}$ and let $D$ be the set of white edges between $I_1$ and $I_2$. Following a fixed ordering of the set
$D$, at each step first decide with probability $q$ whether this edge should be recoloured. If yes, then
recolour it red with probability $p$ (and blue otherwise). All these actions are taken independently.
The stage is completed successfully when the first red edge from $D$ is discovered. Claim 4.1 below
ensures that whp there will be such an edge. Assuming this, let $a_i a_j \in D$ be this edge, where $a_i \in I_1$
and $a_j \in I_2$. The algorithm terminates by setting $C_1$ to be the red cycle formed by the vertices
$a_i, a_{i+1}, \ldots, a_j$ with the red edges of $\mathcal{P}$ together with the red edge $a_i a_j$.

(End of Stage 2) 
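
A minimal sketch of Stage 2, assuming the red path from Stage 1 is given as a Python list and that a callback `is_red` stands for one recolouring query. The name `close_cycle` and the row-by-row ordering of $D$ are assumptions of the example; the text only requires some fixed ordering.

```python
def close_cycle(path, n, k, is_red):
    """Query pairs between I_1 and I_2 until the first red one appears.

    path   -- the red path a_1, ..., a_|A| as a list (path[0] is a_1)
    is_red -- callback standing for one recolouring query on a pair
    Returns the vertex list of the red cycle, or None if no red pair is found.
    """
    if len(path) < n - 2 * k or n < 4 * k:
        raise ValueError("need |A| >= n - 2k and n >= 4k")
    first = range(0, k)                      # indices of I_1 = {a_1, ..., a_k}
    second = range(n - 3 * k, n - 2 * k)     # indices of I_2 = {a_{n-3k+1}, ..., a_{n-2k}}
    for i in first:
        for j in second:
            if is_red(path[i], path[j]):
                return path[i:j + 1]         # a_i, ..., a_j closed by the pair a_i a_j
    return None
```

The returned cycle has $t = j - i + 1$ vertices, which always lies strictly above $n - 4k$ and is at most $n - 2k$.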

In the next claim we prove that some properties which are assumed in the algorithm hold whp.

Claim 4.1. The following properties hold whp:

(i) At the end of Stage 1 we have $|C| = |U| \le k := n(\ln n)^{-0.45}$.

(ii) During Stage 2, at least one edge in $D$ is recoloured red.

Proof of Claim 4.1. In order to prove (i), note that during the algorithm, each pair which has been
queried has been recoloured red with probability $pq \ge (\ln n)^{1/2}/n$, independently at random.
Moreover, it follows from (O1) that none of the pairs between $C$ and $U$ have been coloured red. Note
that by exploring all the edges of the graph we obtain a graph which is distributed as $G(n, pq)$. Now,
since

\[ pq \ge \frac{(\ln n)^{0.5}}{n} \ge \frac{1.35 (\ln n)^{0.45} \ln\ln n}{n} = \frac{3 \ln(n/k)}{k}, \]

it follows from Lemma 3.3 that this graph is whp $k$-pseudorandom and therefore, unless $|C| = |U| \le k$,
there must be red edges between these sets.

Property (ii) follows from a similar argument. □
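
The chain of estimates used in this proof can be sanity-checked numerically. The sketch below reads the display as $(\ln n)^{0.5}/n \ge 1.35(\ln n)^{0.45}\ln\ln n / n = 3\ln(n/k)/k$ with $k = n(\ln n)^{-0.45}$; the helper names are ours. The last step is an identity, while the middle inequality is equivalent to $(\ln n)^{0.05} \ge 1.35 \ln\ln n$ and therefore only holds asymptotically, which is why the second helper works with $x = \ln\ln n$ instead of $n$.

```python
import math

def three_ln_over_k(n):
    """Right-hand side 3 ln(n/k) / k with k = n (ln n)^(-0.45)."""
    L = math.log(n)
    k = n * L ** -0.45
    return 3 * math.log(n / k) / k

def middle_term(n):
    """Middle term 1.35 (ln n)^0.45 ln ln n / n."""
    L = math.log(n)
    return 1.35 * L ** 0.45 * math.log(L) / n

def middle_inequality_holds(x):
    """Check (ln n)^0.05 >= 1.35 ln ln n, written in terms of x = ln ln n."""
    return math.exp(0.05 * x) >= 1.35 * x
```

For instance, the inequality fails at $\ln\ln n = 50$ and holds at $\ln\ln n = 100$, so the estimate is a statement about very large $n$.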

Assuming that the properties of Claim 4.1 hold, denoting by $R_1$, $W_1$ and $B_1$ the sets of red, white
and blue edges, respectively, at the end of this phase's algorithm, we show that whp the following
technical conditions hold:

I.1 the graph $R_1$ contains a cycle $C_1$ of size $t$, where $2n(\ln n)^{-0.45} \le n - t \le 4n(\ln n)^{-0.45}$.

I.2 at most $n$ white edges are recoloured red during this phase, i.e. $|R_1| \le n$.

I.3 for every $v \in V$ we have $d_{W_1}(v, V(C_1)) \ge n - 5n(\ln n)^{-0.45} = (1 - o(1))n$.

I.4 letting $\mathcal{U} := V \setminus V(C_1)$, we have $d_{R_1 \cup B_1}(v, \mathcal{U}) \le 4n(\ln n)^{-0.5} = o(|\mathcal{U}|)$ for every $v \in V$.

Claim 4.2. Properties I.1-I.4 hold whp.

Proof of Claim 4.2. Note that the red cycle $C_1$ formed by the vertices $a_i, a_{i+1}, \ldots, a_j$ obtained at the
end of the algorithm has size $t := j - i + 1$. Moreover, assuming Claim 4.1, since $1 \le i \le k$ and
$n - 3k + 1 \le j \le n - 2k$ we obtain that $n - 4k < n - 4k + 2 \le j - i + 1 = t \le n - 2k$. This settles I.1.

Since a forest in a graph of order $n$ has less than $n$ edges, it is clear by (O2) that in Stage 1
of the algorithm less than $n$ edges are recoloured red. Moreover, since in Stage 2 only one edge is
recoloured red, it follows that in the whole phase at most $n$ edges are recoloured red, settling I.2.

Next, note that $R_1 \cup B_1$ can be seen as part of a graph distributed as $G(n,q)$. Thus, by (P1) of
Lemma 3.5 it follows that I.4 holds whp. Furthermore, by I.1 and I.4 we have that for every $v \in V$

\[ d_{W_1}(v, V(C_1)) \ge |V(C_1)| - 1 - d_{R_1 \cup B_1}(v, V(C_1)) \ge n - 4n(\ln n)^{-0.45} - 1 - \Delta(R_1 \cup B_1) \ge n - 5n(\ln n)^{-0.45}, \]

which settles I.3. □

We have shown that whp at the end of this phase all the properties I.1-I.4 hold. We shall assume
henceforth that they hold for the sets $R_1$, $W_1$ and $B_1$ obtained after this phase.


4.3 Phase II 


In this phase we partition $\mathcal{U}$ into three sets $\mathcal{U} = \mathrm{EXP}_1 \cup \mathrm{SMALL}_1 \cup \mathrm{TINY}$ as described in the
outline. The algorithm for this phase is divided into the following three stages.


(Stage 1) Let F be a subset of W\ [U] obtained by independently adding each edge in Wi[U] to 
F with probability q' := 6 (lnn) -0 ' 15 . Claim l4~3l below ensures that whp | \U\q’ < 5(F) < A (F) < 
^\U\q'. Assuming this, recolour all the edges in F. 

Taking the set $F$ at random serves two purposes. Firstly, it ensures that not too many edges are
recoloured red in this phase. Secondly, it leaves a certain amount of randomness for the edges in
$W_1[\mathcal{U}] \setminus F$, which will be used in later phases.

(End of Stage 1) 

(Stage 2) Let R 1 denote the set of red edges after Stage 1 and set To := {n E U : d R i\ Rl (v,U) < 
^\U\pq'}. Claim I4~3l ensures that whp |To| < ne^( lnn ) 0,4 . Assuming this, starting with i = 0, as long 
as there exists a vertex $v \in \mathcal{U} \setminus T_i$ having at least 3 red neighbours in $T_i$, choose such a vertex $v$ and
set $T_{i+1} := T_i \cup \{v\}$. Let $T_f$ be the last set obtained in this process. Claim 4.3 below shows that
whp $f \le |T_0|$. Assuming that, define $\mathrm{EXP}_1 := \mathcal{U} \setminus T_f$.
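
The growth process $T_0 \subseteq T_1 \subseteq \ldots \subseteq T_f$ is a simple closure operation. A minimal sketch, assuming the red graph is given as a dictionary of neighbour sets; the function name and the rule of always picking the smallest eligible label are choices made for the example, since the text allows any eligible vertex to be chosen.

```python
def grow_exceptional_set(T0, universe, red_adj):
    """Repeatedly add a vertex outside T that has at least 3 red neighbours in T.

    Returns (T_f, added): the final set and the vertices added, in order.
    """
    T = set(T0)
    added = []
    while True:
        candidates = [v for v in sorted(universe - T)
                      if len(red_adj.get(v, set()) & T) >= 3]
        if not candidates:
            return T, added              # every remaining vertex has at most 2 red neighbours in T
        T.add(candidates[0])
        added.append(candidates[0])
```

When the loop stops, every vertex outside the final set has at most two red neighbours inside it, which is the property used for $\mathrm{EXP}_1$.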

Note that every vertex $v \in \mathrm{EXP}_1$ has at most two red neighbours in $T_f$. Thus, by I.1 and the
definition of $T_0$, for every $v \in \mathrm{EXP}_1$ we have $d_{R' \setminus R_1}(v, \mathrm{EXP}_1) \ge 3(\ln n)^{0.4}$.

(End of Stage 2)

(Stage 3) Let $P_1, \ldots, P_m$ be vertex disjoint subpaths of the red cycle $C_1$, each of size 100, where
$m = \lfloor |V(C_1)|/100 \rfloor$, and set $\mathcal{M}_1$ to be the union of all the vertices in the subpaths $P_1, \ldots, P_m$ which are
not endpoints. These paths will be used in later phases for technical reasons to ensure that certain
vertices are not neighbours on the red cycle $C_1$.

Next, recolour all the white edges between $T_f$ and $V(C_1)$, set $\mathrm{SMALL}_1$ to be the set of all vertices
in $T_f$ with at least $(\ln n)^{0.5}$ red neighbours in $\mathcal{M}_1$ and set $\mathrm{TINY} := T_f \setminus \mathrm{SMALL}_1$. The algorithm
for this phase ends by recolouring all the edges in $W_1[\mathcal{U}] \setminus F$ touching at least one vertex in TINY.

(End of Stage 3)

In the next claim we prove that some properties which are assumed in the algorithm hold whp. 


Claim 4.3. All of the following properties hold whp: 

(i) In Stage 1 one has \\U\q' < 5(F) < A(F) < ^\U\q'. 
(ii) In Stage 2 one has $|T_0| \le n e^{-(\ln n)^{0.4}}$ and $f \le |T_0|$.


Proof. First we prove that (i) holds whp. For estimating $\delta(F)$, note that by I.4 we have
$d_{W_1}(v, \mathcal{U}) = (1 - o(1))|\mathcal{U}|$ for every $v \in \mathcal{U}$. Thus, by Chernoff's inequality, for a vertex $v \in \mathcal{U}$ the
probability that $d_F(v)$ falls below the lower bound in (i) is at most

\[ e^{-\Theta(|\mathcal{U}|q')} = e^{-\Theta(n(\ln n)^{-0.6})}. \]

Now, by applying the union bound over all vertices $v \in \mathcal{U}$, it follows that whp $\delta(F)$ satisfies the
lower bound in (i). In a similar way, we can also conclude that whp $\Delta(F)$ satisfies the upper bound
in (i). This settles (i).

Assuming (i), we show next that whp $|T_0| \le n e^{-(\ln n)^{0.4}}$. Indeed, since every $v \in \mathcal{U}$ has
$d_F(v) \ge \delta(F)$ and each edge of $F$ is recoloured red independently with probability $p$, Chernoff's
inequality gives for every $v \in \mathcal{U}$:

\[ \Pr[v \in T_0] \le e^{-(\ln n)^{0.4}}, \]

where we used the fact that $|\mathcal{U}|pq' \ge 12(\ln n)^{0.4}$ by I.1. Therefore, the expected
value of $|T_0|$ is at most $|\mathcal{U}| e^{-(\ln n)^{0.4}}$. Hence, using Markov's inequality we obtain that whp
$|T_0| \le n e^{-(\ln n)^{0.4}}$, as desired.

Finally, we show that if $|T_0| \le n e^{-(\ln n)^{0.4}}$ then whp $f \le |T_0|$. Suppose that $f > |T_0|$. Then, note
that the set $T_{|T_0|}$ contains precisely $2|T_0|$ vertices and induces at least $3|T_0| = 1.5|T_{|T_0|}|$ red edges,
since for every $i \le |T_0|$ the vertex in $T_i \setminus T_{i-1}$ has at least 3 red neighbours in $T_{i-1}$. By (P2) of
Lemma 3.5 (with $c = 1.5$), since $p \le \frac{10 \ln n}{n}$, we know that whp every subset of vertices $X$ of size

\[ |X| \le \left( \frac{1}{\ln n} \cdot \frac{3^{1.5}}{e^{2.5}\, n p^{1.5}} \right)^{2}, \quad \text{a bound which is at least} \quad \frac{3^3\, n}{e^5 \cdot 10^3 (\ln n)^5}, \]

induces less than $1.5|X|$ red edges. Since $|T_{|T_0|}| = 2|T_0| \le 2n e^{-(\ln n)^{0.4}} = o(n(\ln n)^{-5})$ it follows that
whp $f \le |T_0|$. Thus, we conclude that whp (ii) holds as claimed. □

Assuming that the properties of Claim 4.3 hold, denoting by $R_2$, $W_2$ and $B_2$ the sets of red, white
and blue edges at the end of this phase, we show that whp the following technical conditions hold:


II. 1 Properties of the set EXPi: 

(a) lEXPxl > (l- (eV) |W| > ( 2 -o(l))n(lnn)- 0 - 45 . 

(b) for every $v \in \mathrm{EXP}_1$ we have $d_{R_2 \setminus R_1}(v, \mathrm{EXP}_1) \ge 3(\ln n)^{0.4}$.

(c) for every set $U \subseteq \mathrm{EXP}_1$ of size $|U| \ge \left(1 - \frac{1}{\ln n}\right) |\mathrm{EXP}_1|$:

i. if S C U is a set such that (R 2 \ T?i)[S'] has minimum degree at least (Inn ) 0,4 then 
for any set X C S of size |X| < gjijQn(lnn )' 0 - 45 we have |AI R 2 [ 5 ](A) U X\ > 5|A|.
ii. there is a set S C U of size |5| > ^n(lnn) °' 45 such that has diameter at

most 2 In n. 

(d) Let T-L be the collection of all connected graphs H whose vertex set V(H) is a subset 
of EXPi and whose edge set is of the form K± U K 2 where K\ = R, 2 [V(H)\ is such that 
\Nk 1 (X)UX\ > 5|Y| for every X C V(H) of size |Y| < gYj?i(lnn)~ 0 ' 45 and K 2 C 
is a set of size \K->\ < \V(H)\ + 24000. Then whp for every H E PL and every e = xy € 
(^* 2 ^)’ ^ the g ra ph H U {e} does not contain a Hamilton cycle which uses the edge e, 
then the number of e-boosters for H in the set W 2 [V(H)\ is at least 10 _8 n 2 (ln n)~ 0 9 . 

II.2 Properties of the set $\mathrm{SMALL}_1$:

(a) $|\mathrm{SMALL}_1| \le 2n e^{-(\ln n)^{0.4}}$.

(b) $d_{R_2}(u, \mathcal{M}_1) \ge (\ln n)^{0.5}$ for every vertex $u \in \mathrm{SMALL}_1$.

II.3 Properties of the set TINY:

(a) $|\mathrm{TINY}| \le n^{0.04}$.

(b) the event "there is no red path of size at most 1000 between any two vertices of TINY
after recolouring all the edges in $W_2$" holds whp.

(c) $d_{R_2}(u) \ge 2$ for every vertex $u \in \mathrm{TINY}$.

II.4 Only $o(n)$ edges in $W_1$ are recoloured red during this phase.

It is clear that properties II.1(a), II.1(b), II.2(a) and II.2(b) follow immediately from the algorithm.
Moreover, note that at the end of this phase all the edges touching vertices in TINY are either red
or blue. Thus, since the event N.2 holds whp it follows that property II.3(c) also holds whp. The
next few claims ensure that the remaining properties all hold whp.

Claim 4.4. Property II. 1(c) holds whp. 

Proof of Claim 4.4. Consider the following two events:

(El) e R2 \ Rl (S ) < C(lnn)°- 4 |S'| for any set S C EXPi of size |S| < ^n(lnn) -0 " 45 , for C E 

(E2) e R2 (X,Y) > 0 for every pair of disjoint sets X,Y CU each of size at least gggQn(lnn) -0 45 . 

We shall prove that these two events both hold whp and then that II.1(c) holds whenever N.1, II.1(b),
(E1) and (E2) all hold.

The event (E1) holds whp according to (P3) of Lemma 3.5 (with $C$ in the Lemma being $C(\ln n)^{0.4}$
here) since in Stage 1 every edge in $W_1[\mathcal{U}]$ is recoloured red independently with probability
$pq' \le \frac{60(\ln n)^{0.85}}{n}$. Next we show that the event (E2) also holds whp. By I.4, after Phase I we have
e\y 1 (Y, Y) > (1—o(l))s 2 for any pair of disjoint sets X,Y YU each of size at least s := gYjn(ln n) -0 - 45 . 
Thus, by applying Chernoff’s bound and the union bound we obtain 

\[ \Pr[\exists \text{ such sets } X, Y \text{ with } e_{R_2}(X,Y) = 0] \le \binom{n}{s}^{2} (1 - pq')^{(1-o(1))s^2} \le \left(\frac{en}{s}\right)^{2s} e^{-(1-o(1))pq's^2} \le e^{O(n(\ln n)^{-0.45}\ln\ln n)}\, e^{-\Omega(n(\ln n)^{-0.05})} = o(1), \]

implying that (E2) holds whp as desired.

Suppose now that N.1, II.1(b), (E1) and (E2) all hold. Let $U \subseteq \mathrm{EXP}_1$ be a subset of size
$|U| \ge \left(1 - \frac{1}{\ln n}\right)|\mathrm{EXP}_1|$ and denote $R_2 \setminus R_1$ simply by $R'$.

Suppose S C U is such that R'[S ] has minimum degree at least (Inn ) 0 - 4 and that there exists a 
set X C S of size |X| < g^^n^nn ^ 0 ' 45 such that \X ri i (X) U X| < 5|A'|. Since R'[S] has minimum 
degree at least (Inn ) 0,4 it follows that 

e R '(N R , [S] (X)UX) > ^(lnn)°- 4 |X| > ^(lnn) 0 A \N R , [s] (X) U X\. 

Thus, by (El) we have | (X) U X\ > y^Qn(lnn) -0 - 45 , which leads to |A"| > gj^jjn(lnn)^ 0 - 45 , 
contradicting our choice of X. We conclude that for every set X C S of size |X| < gjj^n(lnn ) -0 - 45 
we have |ATr 2 [ 5 ](X) U X\ > |X^/[ 5 ](X) U X\ > 5|A|. This settles i. of II.1(c). 

Observe now that by N.1 and II.1(b) we have

\[ e_{R'}(U) \ge e_{R'}(\mathrm{EXP}_1) - \Delta(R') \cdot |\mathrm{EXP}_1 \setminus U| \ge \frac{3}{2}(\ln n)^{0.4}|\mathrm{EXP}_1| - 40 \ln n \cdot \frac{|\mathrm{EXP}_1|}{\ln n} \ge (\ln n)^{0.4}|U|. \]

Thus, by Lemma 3.7 there exists a set $S \subseteq U$ such that $R'[S]$ is a connected graph with minimum
degree at least $(\ln n)^{0.4}$. In particular, we have

\[ e_{R'}(S) \ge \frac{1}{2}(\ln n)^{0.4}|S| \]

and so by (El) it follows that |S| > ^n{\nn)~ GAb . 

For z € S, set N°(z) := { z } and, for i > 1, define N l (z) := N^s^N 1-1 (z)) U W _1 ( 2 :). 
Note crucially that every vertex in N l (z ) is at distance at most i of z in R /2 [S'] and that, by the 
above, we have |Af l (z)| > 5|N‘ l_ 1 (z)|, provided |A r *' 1 (^)| < —^n(lnn) -0 - 45 . Thus, if we take 
£ := log 5 (gjfjQn(lnn) -0 - 45 ), we see that \N e (z)\ > g^gn(lnn) -0 - 45 for any z £ S. Now, let x,y G S 
be distinct. Note that if N e (x) 0 N e (y) =4 0 then x is at distance at most 2£ from y in ALfS']. If 
instead we have N^(x) 0 N\y) = 0 then by (E2) there is at least one edge in R 2 between N*(x) and 
N e (y), implying that x and y are at distance at most 21 + 1 in R/ 2 [5']. Since 2£ + 1 < 2Inn, this 
settles ii. of II. 1(c). We conclude that II. 1(c) holds whp as claimed. □ 


Claim 4.5. Property II.1(d) holds whp.

Proof. Note first that by Lemma 3.9, given such an $H \in \mathcal{H}$ and $e \in \binom{V(H)}{2}$, the number of $e$-boosters
for $H$ in $\binom{V(H)}{2}$ is at least $\frac{1}{72 \cdot 10^6} n^2 (\ln n)^{-0.9}$. Moreover, since $e$-boosters of $H$ are not edges of $H$
and since all the edges in $R_2[V(H)]$ are edges of $H$, it follows that every $e$-booster for $H$ is either in
$B_2[V(H)]$ or in $W_2[V(H)]$. Note that by properties I.4 and I.1 we have

\[ e_{B_1}(V(H)) \le \frac{1}{2} \cdot 4n(\ln n)^{-0.5} \cdot |\mathcal{U}| \le 8n^2(\ln n)^{-0.95} = o\left(n^2(\ln n)^{-0.9}\right). \]

Furthermore, observe that for every $e$-booster of $H$ which is not in $B_1[V(H)]$, the probability that it
is in $W_2[V(H)]$ is at least the probability that it does not belong to $F$ and so at least $1 - q' = 1 - o(1)$.
Moreover, it is clear that the latter events are independent. Thus, the probability that less than
$10^{-8} n^2 (\ln n)^{-0.9}$ $e$-boosters for $H$ are in the set $W_2[V(H)]$ is at most

\[ \Pr\left[ \mathrm{Bin}\left( (1 - o(1)) \frac{n^2}{72 \cdot 10^6 (\ln n)^{0.9}},\; 1 - o(1) \right) < 10^{-8} n^2 (\ln n)^{-0.9} \right] \le e^{-\Theta(n^2(\ln n)^{-0.9})} \]

by Chernoff's bound (Lemma 3.1).

Suppose now that $|R_2| \le 20 n \ln n$. Assuming this, it is clear that any graph in $\mathcal{H}$ has at most
$20 n \ln n + n$ edges and so, summing over the possible numbers of edges $0 \le i \le 20 n \ln n + n$, we get

\[ |\mathcal{H}| \le e^{100 n (\ln n)^2}. \]

Thus, using the union bound we see that the probability that, for some $H \in \mathcal{H}$ and some $e = xy \in \binom{V(H)}{2}$
such that the graph $H \cup \{e\}$ does not contain a Hamilton cycle which uses the edge $e$, the
number of $e$-boosters for $H$ in the set $W_2[V(H)]$ is less than $10^{-8} n^2 (\ln n)^{-0.9}$, conditioning on the
fact that $|R_2| \le 20 n \ln n$, is at most

\[ e^{100 n (\ln n)^2} \cdot n^2 \cdot e^{-\Theta(n^2(\ln n)^{-0.9})} = o(1). \]

Since whp we have $|R_2| \le 20 n \ln n$ according to N.1, the claim follows. □


Claim 4.6. Properties II.3(a) and II.3(b) hold whp.

Proof of Claim 4.6. Note, by the definition of the set $\mathcal{M}_1$, that $|\mathcal{M}_1| \ge 0.98|V(C_1)| - 100$ and recall
that by I.3, at the end of Phase I, for every $v \in V$ we have $d_{W_1}(v, V(C_1)) \ge n - 5n(\ln n)^{-0.45}$.
Therefore, for every $v \in V$ we have (say) $d_{W_1}(v, \mathcal{M}_1) \ge 0.97n$. Note that for a vertex $v \in T_f$ we have
$d_{R_2 \setminus R_1}(v, \mathcal{M}_1) \sim \mathrm{Bin}(d_{W_1}(v, \mathcal{M}_1), p)$. Thus, since $\ln n \le np \le 10 \ln n$, it follows that for any vertex
$v \in T_f$:

\[ \begin{aligned} \Pr[v \in \mathrm{TINY}] &\le \Pr[\mathrm{Bin}(0.97n, p) < (\ln n)^{0.5}] \le \sum_{i=0}^{(\ln n)^{0.5}} \binom{n}{i} p^i (1-p)^{0.97n - i} \\ &\le (1-p)^{0.97n} \sum_{i=0}^{(\ln n)^{0.5}} \left(\frac{enp}{i(1-p)}\right)^{i} \le e^{-0.97 \ln n} \left((\ln n)^{0.5} + 1\right) (2enp)^{(\ln n)^{0.5}} \le n^{-0.96}, \end{aligned} \]

and therefore $\mathbb{E}[|\mathrm{TINY}|] \le |T_f| n^{-0.96} \le 2n^{0.04} e^{-(\ln n)^{0.4}}$ (recall that whp $|T_f| \le 2n e^{-(\ln n)^{0.4}}$).
Therefore, we conclude by Markov's inequality that whp $|\mathrm{TINY}| \le n^{0.04}$, settling II.3(a).

Now we show that II.3(b) also holds whp. Let $R_2$ and $W_2$ denote the sets of red and white edges
after Stage 2. We assume throughout that $\Delta(R_2) \le 40 \ln n$ (which holds whp by N.1). Suppose that
in Stage 3 instead of only recolouring all the edges in $W_2$ between $T_f$ and $V(C_1)$ we had decided to
recolour all the edges in $W_2$. Let $R$ denote the set of red edges after this recolouring process. We
would like to stress at this point that the edges in $R_2 \subseteq R$ are considered to be fixed and only the
edges in $W_2$ are regarded as being randomly and independently assigned to $R$ with probability $p$.

Let $\mathcal{P}$ be the set of all paths in $K_n$ of size at most 1000. For each $P \in \mathcal{P}$ consider the indicator
random variable $X_P$ of the event that $P$ is a path in the graph formed by the edges in $R$. Finally,
let $X = \sum_{P \in \mathcal{P}} X_P$ denote the total number of paths in $\mathcal{P}$ which are paths in $R$. Note that for each
$v \in V$, the number of paths in $\mathcal{P}$ starting with $v$ which are also paths in $R$ is at most
$\Delta(R) + \Delta(R)^2 + \ldots + \Delta(R)^{1000} \le 1000 \cdot \Delta(R)^{1000}$. Thus, it is clear that $X \le 1000n \cdot \Delta(R)^{1000}$ and so we have

\[ \mathbb{E}\left[X \mid \Delta(R) \le (\ln n)^2\right] \le 1000 n (\ln n)^{2000} \le n^{1.1}. \]

Moreover, note that as $\Delta(R_2) \le 40 \ln n$ we have by Lemma 3.2

\[ \begin{aligned} \Pr\left[\Delta(R) > (\ln n)^2\right] &\le \Pr\left[\Delta(R \setminus R_2) \ge 0.5(\ln n)^2\right] \le \sum_{v \in V} \Pr\left[d_{R \setminus R_2}(v) \ge 0.5(\ln n)^2\right] \\ &\le n \Pr\left[\mathrm{Bin}(n,p) \ge 0.5(\ln n)^2\right] \le n \left(\frac{10 e \ln n}{0.5(\ln n)^2}\right)^{0.5(\ln n)^2} \le e^{-(\ln n)^2}. \end{aligned} \]

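For every pair of vertices $\{u, v\} \subseteq T_f$ let $\mathcal{P}_{u,v} \subseteq \mathcal{P}$ be the collection of paths in $K_n$ of size at most
1000 with endpoints $u$ and $v$. For every $u, v \in T_f$ and $P \in \mathcal{P}_{u,v}$ consider now the families

\[ \mathcal{A}_{u,v} := \{A \subseteq W_2 : u, v \in \mathrm{TINY} \text{ if } R = R_2 \cup A\} \quad \text{and} \quad \mathcal{B}_P := \{B \subseteq W_2 : E(P) \subseteq R \text{ if } R = R_2 \cup B\}. \]

Observe that the families $\mathcal{A}_{u,v}$ and $\mathcal{B}_P$ are monotone decreasing and monotone increasing in the
universe $W_2$, respectively. Furthermore, note crucially that the event "$u, v \in \mathrm{TINY}$" is exactly the
event "$R \setminus R_2 \in \mathcal{A}_{u,v}$" and that the event "$E(P) \subseteq R$" is exactly the event "$R \setminus R_2 \in \mathcal{B}_P$". Since
each edge in $W_2$ is in $R \setminus R_2$ independently with probability $p$ it follows from Theorem 6.3.2 of [1]
that for every $u, v \in T_f$ and $P \in \mathcal{P}_{u,v}$ we have:

\[ \Pr[u, v \in \mathrm{TINY} \text{ and } E(P) \subseteq R] \le \Pr[u, v \in \mathrm{TINY}] \cdot \Pr[E(P) \subseteq R]. \]

Thus, using the union bound and the estimates above, the probability that there exist $u, v \in \mathrm{TINY}$
and $P \in \mathcal{P}_{u,v}$ for which $E(P) \subseteq R$ is at most

\[ \begin{aligned} \sum_{\{u,v\} \subseteq T_f} \sum_{P \in \mathcal{P}_{u,v}} \Pr[u, v \in \mathrm{TINY} \text{ and } E(P) \subseteq R] &\le \sum_{\{u,v\} \subseteq T_f} \sum_{P \in \mathcal{P}_{u,v}} \Pr[u, v \in \mathrm{TINY}] \cdot \Pr[E(P) \subseteq R] \\ &\le \sum_{\{u,v\} \subseteq T_f} \sum_{P \in \mathcal{P}_{u,v}} \left(n^{-0.96}\right)^2 \cdot \left( \Pr\left[E(P) \subseteq R \mid \Delta(R) \le (\ln n)^2\right] + \Pr\left[\Delta(R) > (\ln n)^2\right] \right) \\ &\le n^{-1.92} \sum_{\{u,v\} \subseteq T_f} \sum_{P \in \mathcal{P}_{u,v}} \left( \mathbb{E}\left[X_P \mid \Delta(R) \le (\ln n)^2\right] + e^{-(\ln n)^2} \right) \\ &\le n^{-1.92} \left( \mathbb{E}\left[X \mid \Delta(R) \le (\ln n)^2\right] + e^{-(\ln n)^2} |\mathcal{P}| \right) \le n^{-1.92} \left( n^{1.1} + e^{-(\ln n)^2} 1000 n^{1000} \right) = o(1), \end{aligned} \]

where in the second inequality we used the fact that the events $u \in \mathrm{TINY}$ and $v \in \mathrm{TINY}$ are
independent. This settles II.3(b). □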

Claim 4.7. Property II.4 holds whp.

Proof of Claim 4.7. In this phase edges are recoloured once in Stage 1 and in two instances in Stage
3. We shall bound the number of edges recoloured red in these three instances.

In Stage 1 we recoloured all the edges of $F$, of which there are at most $\Delta(F) \cdot |\mathcal{U}|$. Since by the
algorithm we have whp that $\Delta(F) = O(|\mathcal{U}|q')$ we conclude using Chernoff's bounds and I.1 that whp
the number of edges recoloured red in Stage 1 is at most

\[ \Delta(F) \cdot |\mathcal{U}| \cdot p = O\left(|\mathcal{U}|^2 q' p\right) = O\left(\frac{n}{(\ln n)^{0.05}}\right) = o(n). \]

In Stage 3 we recoloured all the white edges between $T_f$ and $V(C_1)$ and all the white edges
touching vertices of TINY. Since, from the algorithm, $|T_f| \le 2|T_0| \le 2n e^{-(\ln n)^{0.4}}$, we conclude from
Chernoff's bounds that whp the number of edges between $T_f$ and $V(C_1)$ which are recoloured red is
at most

\[ 2|T_f| \cdot |V(C_1)| \cdot p \le 4n e^{-(\ln n)^{0.4}} \cdot n \cdot p = O\left(n \ln n \cdot e^{-(\ln n)^{0.4}}\right) = o(n). \]

Finally, since there are at most $|\mathrm{TINY}| n$ edges touching vertices of TINY, we can use Chernoff's
bounds together with II.3(a) to conclude that whp the number of edges touching vertices of TINY
which are recoloured red is at most

\[ 2|\mathrm{TINY}| \cdot n \cdot p = O\left(n^{0.04} \ln n\right) = o(n). \]

Thus, whp $o(n)$ edges in $W_1$ are recoloured red in this phase, proving the claim. □

We have shown that whp at the end of this phase all of the properties II.1-II.4 hold. We shall
assume henceforth that all these properties hold for the sets $R_2$, $W_2$ and $B_2$.


4.4 Phase III 

In this phase, we want to find a red cycle $C_2$ containing $\mathrm{TINY} \cup V(C_1)$ as described in the outline.
Recall that in Stage 3 of Phase II we nearly decomposed the red cycle $C_1$ into $m = \lfloor |V(C_1)|/100 \rfloor$ red paths
$P_1, \ldots, P_m$, needed for technical reasons for later phases. We ensure in this phase that the red cycle
$C_2$ will be such that most of these paths are also paths in $C_2$. Concretely, we obtain a set $J \subseteq [m]$ of
size $|J| \ge (1 - o(1))m$ such that all the paths $(P_j)_{j \in J}$ are paths in the red cycle $C_2$. At the end of
this phase we get a partition of the vertex set $V = V(C_2) \cup \mathrm{EXP}_2 \cup \mathrm{SMALL}_2$ where $\mathrm{EXP}_2 \subseteq \mathrm{EXP}_1$
is a "good expander" and $\mathrm{SMALL}_2 \subseteq \mathrm{SMALL}_1$ is such that every vertex $v \in \mathrm{SMALL}_2$ has "large"
red degree onto the set $\mathcal{M}_2$ which is the union of all the vertices of the paths $(P_j)_{j \in J}$ which are not
endpoints.

The algorithm for this phase is divided into $t = |\mathrm{TINY}|$ parts. For each $i \in [t]$ we define during
Part $i$ the following sets:

• $R^i$ and $W^i$ which denote, respectively, the sets of all edges which are coloured red and white
at the end of Part $i$.

• $\mathrm{EXP}^i \subseteq \mathrm{EXP}_1$, $\mathrm{SMALL}^i \subseteq \mathrm{SMALL}_1$ and $\mathcal{U}^i \subseteq \mathcal{U}$, the latter being the union of $\mathrm{EXP}^i$, $\mathrm{SMALL}^i$
and $t - i$ vertices of TINY.

During Part $i$, we recolour "some" edges in $W^{i-1}$ in order to obtain a red cycle $C^i$ (i.e. consisting
solely of edges in $R^i$) such that $V(C^i) = V \setminus \mathcal{U}^i$ contains $V(C_1)$, $i$ vertices from TINY and "few"
vertices from $\mathcal{U}$. During this phase's algorithm we keep track of which red paths $(P_j)_{j \in [m]}$ are also
paths in the red cycle $C_2$. To this end, we use the function $j : V \to \{0, 1, \ldots, m\}$ defined as:

\[ j(v) := \begin{cases} j & \text{if } v \in V(P_j) \text{ for } j \in [m], \\ 0 & \text{if } v \notin \bigcup_{j=1}^{m} V(P_j). \end{cases} \]
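
The bookkeeping around the paths $P_1, \ldots, P_m$ and the index function $j$ can be sketched as follows. This is an illustration only: the name `split_cycle` is ours, the cycle is assumed to be given as a list of vertices in cyclic order, and taking $m$ as large as possible (the integer part of the cycle length divided by the path size) is an assumption of the sketch.

```python
def split_cycle(cycle, size=100):
    """Cut a cycle (list of vertices in cyclic order) into disjoint paths of `size` vertices.

    Returns (paths, j, inner): the list of paths, the index function j and the
    set of vertices of the paths which are not endpoints.
    """
    m = len(cycle) // size                       # assumed: as many paths as fit
    paths = [cycle[r * size:(r + 1) * size] for r in range(m)]
    index = {}
    for r, path in enumerate(paths, start=1):
        for v in path:
            index[v] = r                         # v lies on the path P_r

    def j(v):
        return index.get(v, 0)                   # 0 for vertices on none of the paths

    inner = {v for path in paths for v in path[1:-1]}
    return paths, j, inner
```

Each path contributes `size - 2` inner vertices, so with paths of size 100 the set of inner vertices covers roughly 98% of the cycle.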


During Part $i$ we maintain a set $J^i \subseteq [m]$ such that for every $j \in J^i$ the path $P_j$ is a path in the
cycle $C^i$. The algorithm for this phase is as follows:

Algorithm: Fix an enumeration $x_1, x_2, \ldots, x_t$ of the vertices in TINY, where $t = |\mathrm{TINY}|$, and
set $C^0 := C_1$, $\mathcal{U}^0 := \mathcal{U}$, $\mathrm{EXP}^0 := \mathrm{EXP}_1$, $\mathrm{SMALL}^0 := \mathrm{SMALL}_1$, $R^0 := R_2$, $W^0 := W_2$ and $J^0 := [m]$.
For $i = 1, 2, \ldots, t$ execute the following routine which shows how to add $x_i$ to the red cycle $C^{i-1}$:

Routine: Recall from II.3(c) that $d_{R_2}(x_i) \ge 2$. Thus, since $R_2 \subseteq R^{i-1}$, exactly one of the
following holds:

(a) $d_{R^{i-1}}(x_i, V(C^{i-1})) \ge 1$ and $d_{R^{i-1}}(x_i, \mathcal{U}^{i-1}) \ge 1$.

(b) $d_{R^{i-1}}(x_i, V(C^{i-1})) \ge 2$ and $d_{R^{i-1}}(x_i, \mathcal{U}^{i-1}) = 0$.

(c) $d_{R^{i-1}}(x_i, V(C^{i-1})) = 0$ and $d_{R^{i-1}}(x_i, \mathcal{U}^{i-1}) \ge 2$.
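
The three cases are mutually exclusive and cover every vertex with at least two red neighbours, which a few lines of Python make explicit. The function name `routine_case` and the representation of the red graph as a dictionary of neighbour sets are assumptions of this sketch.

```python
def routine_case(x, red_adj, cycle_vertices, outside):
    """Return 'a', 'b' or 'c' according to where the red neighbours of x lie.

    cycle_vertices -- the vertex set of the current red cycle
    outside        -- the vertices currently outside the cycle
    """
    on_cycle = len(red_adj[x] & cycle_vertices)
    off_cycle = len(red_adj[x] & outside)
    if on_cycle + off_cycle < 2:
        raise ValueError("x must have at least two red neighbours")
    if on_cycle >= 1 and off_cycle >= 1:
        return "a"                   # one neighbour on each side
    return "b" if off_cycle == 0 else "c"
```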


We proceed depending on which of the cases above holds. For each of these cases we consider two red
neighbours of $x_i$ and depending on whether they lie in $V(C^{i-1})$ or in $\mathcal{U}^{i-1}$ we use them in a certain
way to incorporate $x_i$ into the red cycle $C^{i-1}$. For the sake of simplicity, we only describe here how
to proceed if (a) holds. However, we stress that this case contains all the ideas necessary for treating
the other two cases. Essentially, for case (b) (resp. (c)) the two red neighbours of $x_i$ considered
should be treated as the red neighbour of $x_i$ in case (a) which lies in $V(C^{i-1})$ (resp. $\mathcal{U}^{i-1}$). If case
(a) holds, proceed as follows:

Fix a cyclic enumeration $v_1 v_2 \ldots v_\ell$ of the vertices in the red cycle $C^{i-1}$, where $\ell = |V(C^{i-1})|$
(indices considered modulo $\ell$), and let $z_1^i \in V(C^{i-1})$ and $z_2^i \in \mathcal{U}^{i-1}$ be two red neighbours of $x_i$.
Without loss of generality we assume that $z_1^i = v_\ell$.

Set BAD,; := {v G EXP* -1 : d R2 (v, EXP* -1 ) < 3(lnn)°' 4 } to be the set of all vertices in EXP* -1 
with “low” red degree inside EXP* -1 and set GOODj := EXP* -1 \ (BAD,; U Wj 2 (BADj) U {z 2 }). Let 
Si C GOODj be a subset of size at least ^^(lnn) -0 ' 45 such that I?* -1 [5,] is a connected graph of 
diameter at most 2 In n. Claim 14.81 below ensures that whp such a set S, exists and we assume this 
henceforth. 

For each $v_j \in V(C^{i-1})$ define $s^+(v_j) := v_{j+1}$ and $s^-(v_j) := v_{j-1}$ to be the "successor" and
"predecessor" of $v_j$ in the cycle $C^{i-1}$ (notice that $s^+(z_1^i) = s^+(v_\ell) = v_1$). Recolour all the
edges in $W^{i-1}$ between $\{s^+(z_1^i), z_2^i\}$ and $V(C_1)$ and, letting $R_1^i$ and $W_1^i$ denote, respectively, the
sets of red and white edges at this point, consider the sets $A_1^i := N_{R_1^i}(s^+(z_1^i), V(C_1)) \setminus \{z_1^i\}$ and
$A_2^i := N_{R_1^i}(z_2^i, V(C_1)) \setminus \{z_1^i, s^+(z_1^i)\}$. Note that for any two vertices $v_a \in A_1^i$ and $v_b \in A_2^i$ we have a
red path

\[ P(v_a, v_b) := \begin{cases} v_{a-1} v_{a-2} \ldots v_2 v_1 v_a v_{a+1} \ldots v_{b-1} v_b z_2^i x_i v_\ell v_{\ell-1} \ldots v_{b+1} & \text{if } 1 < a < b < \ell, \\ v_{a-1} v_{a-2} \ldots v_{b+1} v_b z_2^i x_i v_\ell v_{\ell-1} \ldots v_{a+1} v_a v_1 v_2 \ldots v_{b-1} & \text{if } 1 < b < a < \ell, \end{cases} \]

from $s^-(v_a) = v_{a-1}$ to either $s^+(v_b) = v_{b+1}$ or $s^-(v_b) = v_{b-1}$ such that $V(P(v_a, v_b)) = V(C^{i-1}) \cup \{x_i, z_2^i\}$.
Define $B_1^i := \{s^-(v) : v \in A_1^i\}$ to be the set of possible initial vertices of these paths.
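
The path $P(v_a, v_b)$ can be built explicitly, which makes it easy to check that it visits every vertex of the cycle together with $x_i$ and $z_2^i$. The sketch below reads the two cases of the definition as: go backwards from $v_{a-1}$, jump along the chord $v_1 v_a$, reach $v_b$, and continue through $z_2^i$, $x_i$ and $v_\ell$. The function name and the list representation of the cycle are assumptions made for the example.

```python
def extended_path(cycle, x, z2, a, b):
    """Build P(v_a, v_b) for a cycle [v_1, ..., v_l], where cycle[j - 1] is v_j.

    Assumes x is adjacent to v_l and z2, v_a is adjacent to v_1 and v_b is
    adjacent to z2; the indices must satisfy 1 < a, b < l and a != b.
    """
    l = len(cycle)
    if not (1 < a < l and 1 < b < l and a != b):
        raise ValueError("need 1 < a, b < l and a != b")
    if a < b:
        head = list(range(a - 1, 0, -1)) + list(range(a, b + 1))   # v_{a-1}..v_1, v_a..v_b
        tail = list(range(l, b, -1))                               # v_l..v_{b+1}
    else:
        head = list(range(a - 1, b - 1, -1))                       # v_{a-1}..v_b
        tail = list(range(l, a - 1, -1)) + list(range(1, b))       # v_l..v_a, v_1..v_{b-1}
    return [cycle[j - 1] for j in head] + [z2, x] + [cycle[j - 1] for j in tail]
```

In both cases the path starts at $v_{a-1}$; it ends at $v_{b+1}$ when $a < b$ and at $v_{b-1}$ when $b < a$.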

Recolour all the edges in $W_1^i$ between vertices in $B_1^i$ and $S_i$ and, letting $R_2^i$ and $W_2^i$ denote,
respectively, the sets of red and white edges at this point, let $y_1^i \in B_1^i$ and $u_1^i \in S_i$ be such that
$y_1^i u_1^i \in R_2^i$. Claim 4.8 ensures that whp such vertices exist and we assume this henceforth. Define
now $B_2^i$ to be the set of possible final vertices of the paths $P(s^+(y_1^i), v)$ where $v \in A_2^i$.

Recolour all the edges in $W_2^i$ between vertices in $B_2^i$ and $S_i$ and, letting $R^i$ and $W^i$ denote,
respectively, the sets of red and white edges at this point, let $y_2^i \in B_2^i$ and $u_2^i \in S_i$ be such that
$y_2^i u_2^i \in R^i$. Claim 4.8 ensures that whp such vertices exist and we assume this henceforth. Moreover,
let $s(y_2^i) \in \{s^+(y_2^i), s^-(y_2^i)\}$ be the vertex of $V(C^{i-1})$ such that $y_2^i$ is the final vertex of the red path
$P(s^+(y_1^i), s(y_2^i))$.

Let $P(u_1^i, u_2^i)$ be a path inside $S_i$ from $u_1^i$ to $u_2^i$ of length at most $2 \ln n$ consisting solely of edges
in $R^i$ (such a path exists by the choice of $S_i$) and set $C^i$ to be the red cycle formed by joining
the red paths $P(s^+(y_1^i), s(y_2^i))$ and $P(u_1^i, u_2^i)$ with the red edges $y_1^i u_1^i$ and $y_2^i u_2^i$. Furthermore, set
$J^i := J^{i-1} \setminus \left( \{j(z_1^i)\} \cup \{j(y_1^i)\} \cup \{j(y_2^i)\} \right)$ to be the set of indices obtained by deleting the indices of
the paths we "broke" during this routine, and note that every path $P_j$ with $j \in J^i$ is still a subpath
of the red cycle $C^i$ (provided it was also a subpath of $C^{i-1}$). Finally, set $\mathrm{EXP}^i := \mathrm{EXP}^{i-1} \setminus V(C^i)$,
$\mathrm{SMALL}^i := \mathrm{SMALL}^{i-1} \setminus V(C^i)$ and $\mathcal{U}^i := \mathcal{U}^{i-1} \setminus V(C^i)$.

(End of Routine)

To end the algorithm set $\mathrm{EXP}_2 := \mathrm{EXP}^t$, $\mathrm{SMALL}_2 := \mathrm{SMALL}^t$, $C_2 := C^t$, $J := J^t$ and $\mathcal{M}_2$ to be the
union of all the inner vertices of the paths $(P_j)_{j \in J}$.

(End of Algorithm) 

We make a few observations about the procedure above which will be important for later:

(O1) For every $i \in [t]$ we have $|\mathrm{EXP}^{i-1} \setminus \mathrm{EXP}^i| \le 2 \ln n + 3$. This is because in Part $i$ we have
$\mathrm{EXP}^{i-1} \setminus \mathrm{EXP}^i \subseteq V(P(u_1^i, u_2^i)) \cup \{z_1^i, z_2^i\}$ where $P(u_1^i, u_2^i)$ is a path of size at most $2 \ln n + 1$
(we might need to remove not just $z_2^i$ but also $z_1^i$ from $\mathrm{EXP}^{i-1}$ if case (c) in the algorithm
holds). Moreover, since by II.1(b) we have $d_{R_2}(v, \mathrm{EXP}_1) \ge 3(\ln n)^{0.4}$ for every $v \in \mathrm{EXP}_1$,
every vertex $v \in \mathrm{BAD}_i$ must have at least one red neighbour in $\mathrm{EXP}_1 \setminus \mathrm{EXP}^{i-1}$. Thus, we
have $|\mathrm{BAD}_i| \le |\mathrm{EXP}_1 \setminus \mathrm{EXP}^{i-1}| \cdot \Delta(R_2) \le (2 \ln n + 3)(i - 1) \cdot \Delta(R_2)$.

(O2) For every $i \in [t]$ and every $j \in \{1, 2\}$ we have $d_{R^i}(v_j^i, V(C_1)) - 2 \le |B_j^i| \le d_{R^i}(v_j^i, V(C_1))$ for
some vertex $v_j^i$ which is at distance at most 2 from $x_i$ in $R_2$. For example, if case (a) holds
in Part $i$ then we have $v_1^i = s^+(z_1^i)$ and $v_2^i = z_2^i$. Moreover, for every $i \in [t]$ and every vertex
$v \in B_1^i \cup B_2^i$ there is a path in $R^i$ of length at most 4 from $x_i$ to $v$. This follows immediately
from the definition of the sets $B_j^i$.

(O3) We have $N_{W^{i-1}}(v, \mathrm{EXP}^{i-1}) = N_{W_1}(v, \mathrm{EXP}^{i-1})$ for every $v \in V(C_1) \setminus \bigcup_{j=1}^{i-1} (B_1^j \cup B_2^j)$ and
every $i \in [t + 1]$. This is because between Phase I and Part $i$ of this phase's algorithm the only
edges that are recoloured between $V(C_1)$ and $\mathrm{EXP}^{i-1}$ touch vertices of $\bigcup_{j=1}^{i-1} (B_1^j \cup B_2^j)$.

Also, for the rest of this phase, we shall assume that the following event occurs:

(E1) for every $i \in [t]$ there is no path of size at most 1000 consisting solely of edges in $R^i$ between
any two vertices in TINY.

Note that this event occurs whp as indicated in II.3(b). In the next claim we prove that some
properties which are assumed in the algorithm hold whp.

Claim 4.8. All of the following properties hold whp: 

(i) For any i £ [f] there always exists a set Si C GOODi of size at least ^n^nn) -0 - 45 such that 
R* -1 [Sj] is a connected graph of diameter at most 2 In n. 

(ii) For any $i \in [t]$, in Part $i$, after recolouring all the edges in $W^{i-1}$ between the sets $S_i$ and
$B_1^i \cup B_2^i$, there exist $y_1^i \in B_1^i$, $y_2^i \in B_2^i$ and $u_1^i, u_2^i \in S_i$ such that $y_j^i u_j^i \in R^i$ for $j \in \{1, 2\}$.

Proof of Claim 4.8. We start by proving that (i) holds whp. Assuming that $\Delta(R_2) \le 40 \ln n$ (which
holds whp after Phase II, by N.1), we have by (O1), II.1(a) and II.3(a) that

\[ \begin{aligned} |\mathrm{GOOD}_i| &\ge |\mathrm{EXP}^{i-1}| - |\mathrm{BAD}_i \cup N_{R_2}(\mathrm{BAD}_i)| - 2 \\ &\ge |\mathrm{EXP}_1| - (2 \ln n + 3) \cdot (i - 1) - |\mathrm{BAD}_i| \cdot (1 + 40 \ln n) - 2 \ge \left(1 - \frac{1}{\ln n}\right) |\mathrm{EXP}_1|. \end{aligned} \]

We remark that the $-2$ after the first inequality is necessary if case (c) holds (for case (a) one only
needs $-1$). Thus, by II.1(c) there is always a set $S_i \subseteq \mathrm{GOOD}_i$ with the desired properties.

Next we show that (ii) holds whp. By (O2) we know that for every $i \in [t]$ and every $j \in \{1, 2\}$
we have $|B_j^i| \ge d_{R^i}(v_j^i, V(C_1)) - 2$ for some vertex $v_j^i$ which is at distance at most 2 from $x_i$ in $R_2$.
We claim that whp $d_{R^i}(v_j^i, V(C_1)) \ge (\ln n)^{0.5}$ for every $i \in [t]$ and $j \in \{1, 2\}$.

Note that $v_j^i \notin \mathrm{TINY}$ as otherwise $v_j^i$ and $x_i$ would be two vertices in TINY at distance at most
2 in $R^i$, contradicting (E1). Moreover, recall from II.2(b) that if $v_j^i \in \mathrm{SMALL}_1$ then $d_{R^i}(v_j^i, V(C_1)) \ge (\ln n)^{0.5}$.
If $v_j^i \in \mathrm{EXP}_1 \cup V(C_1)$ then recall from I.3 that $d_{W_1}(v_j^i, V(C_1)) = (1 - o(1))n$ and note
that $d_{W_2}(v_j^i, V(C_1)) = d_{W_1}(v_j^i, V(C_1))$ as no edges between the sets $\mathrm{EXP}_1 \cup V(C_1)$ and $V(C_1)$ were
recoloured during Phase II. Thus, using the union bound, Lemma 3.1 and II.3(a) we see that the
probability that $d_{R^i}(v_j^i, V(C_1)) < (\ln n)^{0.5}$ for some $i \in [t]$ and $j \in \{1, 2\}$ is at most

2 1 ■ Pr [Bin((l - o(l))n,p) < (Inn) 0 ' 5 ] < 2n 0 ' 04 • e"^-° (1) ) lnn = o(l), 

as claimed. Hence, whp $|B_j^i| \ge (\ln n)^{0.5} - 2$ for every $i \in [t]$ and $j \in \{1, 2\}$. We assume this hereafter.

Note that for $i \ne i'$ we have $(B_1^i \cup B_2^i) \cap (B_1^{i'} \cup B_2^{i'}) = \emptyset$ since otherwise (O2) would imply that
there is a red path of length at most 8 between $x_i$ and $x_{i'}$, contradicting (E1). Thus, by (O3) and
I.4, we see that $d_{W^{i-1}}(v, S_i) = d_{W_1}(v, S_i) = (1 - o(1))|S_i|$ for every $v \in B_1^i \cup B_2^i$ and every $i \in [t]$.

Now, for each i E [t] and j E {1,2}, let (7) C be a subset of size at least (^ — o(l)) \Bj\ > 
|(lnn ) 0,5 such that C\ n C\ = 0. We then see that for any i E [t] and j € {1, 2} the probability that 
there is no edge in R l between C* and S t is at most: 


Pr [Bin((l-o(l))|5i| • \C%p) = o] = (1 - p)( 1 -»( 1 ))|Sd-|Cjl < e -°- 45 -|(lnn ) 0 - 5 < I_ 


Using 11.3(a) and the union bound we see that the probability that for some i E [t] and j € {1,2} 
there is no edge in R 1 between Bj and S t is at most 2t- ^ = o(l). This shows that (ii) holds whp. □ 

Assuming that the properties of Claim 4.8 hold, denoting by $R_3$, $W_3$ and $B_3$ the sets of red,
white and blue edges at the end of this phase's algorithm, we show that whp the following technical
conditions hold:


III. 1 Properties of EXP 2 : 

(a) |EXP 2 | > (l - \U\ > (2 - o(l))n(lnn)-°- 45 . 

(b) for every $v \in \mathrm{EXP}_2$ we have $d_{R_3 \setminus R_1}(v, \mathrm{EXP}_2) \ge 2(\ln n)^{0.4}$.

III.2 Properties of $\mathrm{SMALL}_2$:

(a) $|\mathrm{SMALL}_2| \le |\mathrm{SMALL}_1| \le 2n e^{-(\ln n)^{0.4}}$.

(b) for every $v \in \mathrm{SMALL}_2$ we have $d_{R_3}(v, \mathcal{M}_2) \ge (\ln n)^{0.5} - 400$.

(c) for every $v \in \mathrm{SMALL}_2$ we have $N_{W_3}(u, \mathrm{EXP}_2) \ne N_{W_1}(u, \mathrm{EXP}_2)$ for at most 100 vertices
$u \in V(C_1)$ which are at distance at most 2 from $v$ in $R_3$.

III.3 All the paths $(P_j)_{j \in J}$ are paths in the red cycle $C_2$.

III.4 In this phase only $o(n)$ edges of $W_2$ are recoloured red.

Note that property III.2(a) is just a consequence of the fact that $\mathrm{SMALL}_2 \subseteq \mathrm{SMALL}_1$ together
with II.2(a). Moreover, property III.3 follows immediately from the algorithm. Indeed, $E(C^{i-1}) \setminus E(C^i)$
consists of at most 4 edges (only 3 edges if case (a) holds but 4 if case (b) holds) and, from
the definition of $J^i$, for each such edge $e$ we remove $j(v)$ from $J^{i-1}$ for one vertex $v \in e$. Thus, it is
clear that $P_j$ is still a path in the red cycle $C^i$ for every $j \in J^i$. The next few claims show that the
remaining properties all hold whp, assuming that the properties of Claim 4.8 hold.

Claim 4.9. Properties III.1(a) and III.1(b) hold whp.

Proof of Claim 4.9. It follows from (O1), I.1, II.1(a) and II.3(a) that:

|EXP 2 | = |EXPi| — |EXP * -1 \ EXP*| > (l- —^ \U\ -t- (2 Inn + 3) > f 1 - -^- 7 ) \U\ 

' \ (In up J \ (In n) z y 

and so this shows that III. 1(a) holds. 

We now show that III.1(b) also holds whp. Suppose $d_{R_3 \setminus R_1}(v, \mathrm{EXP}_2) < 2(\ln n)^{0.4}$ for some $v \in \mathrm{EXP}_2$ and let $i \in [t]$ be the largest possible integer such that $d_{R_2 \setminus R_1}(v, \mathrm{EXP}^{i-1}) \ge 3(\ln n)^{0.4}$ (this is well defined by II.1(b)). Note that since $R_3[U] = R_2[U]$ we have

$$d_{R_3 \setminus R_1}(v, \mathrm{EXP}_2) = d_{R_2 \setminus R_1}(v, \mathrm{EXP}^{i-1}) - \sum_{j=i}^{t} d_{R_2 \setminus R_1}(v, \mathrm{EXP}^{j-1} \setminus \mathrm{EXP}^{j}).$$

Recall that $\mathrm{EXP}^{j-1} \setminus \mathrm{EXP}^{j} \subseteq V(P(u_1^j, u_2^j)) \cup \{z_1^j, z_2^j\} \subseteq \mathrm{GOOD}_j \cup \{z_1^j, z_2^j\}$ where $z_1^j$ and $z_2^j$ are neighbours of $x_j$ in $R_2$. Moreover, note that by the choice of $i$ we have $v \in \mathrm{BAD}_j$ for every $j > i$. Thus, since $\mathrm{GOOD}_j \cap N_{R_2}(\mathrm{BAD}_j) = \emptyset$ we get that

$$d_{R_3 \setminus R_1}(v, \mathrm{EXP}_2) \ge d_{R_2 \setminus R_1}(v, \mathrm{EXP}^{i-1}) - d_{R_2}(v, V(P(u_1^i, u_2^i))) - \sum_{j=i}^{t} d_{R_2}(v, \{z_1^j, z_2^j\}).$$

Observe now that if $d_{R_2}(v, \{z_1^j, z_2^j\}) > 0$ and $d_{R_2}(v, \{z_1^{j'}, z_2^{j'}\}) > 0$ for $j \ne j'$ then there would exist a path in $R_3$ of length at most 4 between $x_j$ and $x_{j'}$, contradicting (E1). Thus, we can conclude that

$$d_{R_3 \setminus R_1}(v, \mathrm{EXP}_2) \ge d_{R_2 \setminus R_1}(v, \mathrm{EXP}^{i-1}) - d_{R_2}(v, V(P(u_1^i, u_2^i))) - 2 \ge 3(\ln n)^{0.4} - d_{R_2}(v, V(P(u_1^i, u_2^i))) - 2.$$

Since we assumed that $d_{R_3 \setminus R_1}(v, \mathrm{EXP}_2) < 2(\ln n)^{0.4}$, we must have that

$$d_{R_2}(v, V(P(u_1^i, u_2^i))) > (\ln n)^{0.4} - 2.$$

It is easy to see that this implies the existence of two cycles of length $O((\ln n)^{0.6})$ sharing only the vertex $v$ in the graph $R_2$ since the red path $P(u_1^i, u_2^i)$ has length at most $2\ln n$. However, if N.3 holds then this does not happen. Since N.3 holds whp we conclude that III.1(b) holds whp as desired. □

Claim 4.10. Properties III.2(b) and III.2(c) hold whp.

Proof of Claim 4.10. First we show that whp III.2(b) holds. Note that for every $v \in \mathrm{SMALL}_2$ we have

$$d_{R_3}(v, M_2) \ge d_{R_3}(v, M_1) - \sum_{i=1}^{t} \sum_{j \in J^{i-1} \setminus J^{i}} d_{R_3}(v, V(P_j))$$

since $M_1 \setminus \left(\bigcup_{i=1}^{t} \bigcup_{j \in J^{i-1} \setminus J^{i}} V(P_j)\right) \subseteq M_2$. Also, since $|J^{i-1} \setminus J^{i}| \le 4$ and since $|V(P_j)| = 100$ we have that for every $i \in [t]$

$$\sum_{j \in J^{i-1} \setminus J^{i}} d_{R_3}(v, V(P_j)) \le 400.$$

Observe now that if for $i \ne i'$, $j \in J^{i-1} \setminus J^{i}$ and $j' \in J^{i'-1} \setminus J^{i'}$ we have $d_{R_3}(v, V(P_j)) > 0$ and $d_{R_3}(v, V(P_{j'})) > 0$ then there is a path in $R_3$ of size at most $2 \cdot 102 + 2 + 1 = 207$ between $x_i$ and $x_{i'}$. Indeed, this follows from the fact that every vertex in $V(P_j)$ is at distance at most $3 + (|V(P_j)| - 1) = 102$ of $x_i$ in $R^t$ and similarly that every vertex in $V(P_{j'})$ is at distance at most 102 of $x_{i'}$ in $R^t$. However, if the event (E1) holds then there does not exist such a path. Since (E1) holds whp we conclude that whp

$$d_{R_3}(v, M_2) \ge d_{R_3}(v, M_1) - 400,$$

and so by II.2(b) we see that III.2(b) holds.

Now we show that whp III.2(c) holds. Note that by (O3) it is enough to show that for every $v \in \mathrm{SMALL}_2$ there are at most 100 vertices $u \in \bigcup_{i=1}^{t} (B_1^i \cup B_2^i)$ which are at distance at most 2 from $v$ in $R_3$. Assume this is not the case for a vertex $v \in \mathrm{SMALL}_2$. Note first, if for some $j \ne j'$ the sets $B_1^j \cup B_2^j$ and $B_1^{j'} \cup B_2^{j'}$ each contain a vertex which is at distance at most 2 from $v$ in $R_3$, then by (O2) it follows that there is a red path from $x_j$ to $x_{j'}$ of length at most $4 + 2 + 2 + 4 = 12$ in $R_3$. However, this cannot occur if the event (E1) holds. Since (E1) holds whp we conclude that whp for every $v \in \mathrm{SMALL}_2$ there is at most one $j \in [t]$ for which $B_1^j \cup B_2^j$ contains a vertex which is at distance at most 2 from $v$ in $R_3$. Moreover, if for some $j \in [t]$ the set $B_1^j \cup B_2^j$ contains at least 100 vertices which are at distance at most 2 from $v$ in $R_3$ then by (O2) one can find at least four trails of length at most $4 + 2 = 6$ between $v$ and $x_j$. However, this cannot occur if N.4 holds. Since N.4 holds whp we conclude that whp III.2(c) holds. □

Claim 4.11. Property III.4 holds whp.

Proof of Claim 4.11. Recall that in Part $i$ of the algorithm we recolour edges in two situations. In the first situation we recolour edges between two vertices (if case (a) holds then these vertices are $s^+(z_1^i)$ and $z_2^i$) and $V(C_1)$. In the second situation we recolour edges between vertices in $B_1^i \cup B_2^i$ and $S_i$. From the algorithm it is easy to see that the number of edges in the first situation which are recoloured red is at most $|B_1^i| + |B_2^i| + 4$. For each $i \in [t]$ and $j \in \{1,2\}$ let $E_1^{i,j}$ be the event that $|B_j^i| > (\ln n)^{1.1}$ and let $E_2^{i,j}$ be the event that $e_{R^i \setminus R^{i-1}}(B_j^i, S_i) > 80(\ln n)^{1.65}$. Notice that if none of these events hold then clearly at most $(2(\ln n)^{1.1} + 4) \cdot t + 80(\ln n)^{1.65} \cdot 2t = o(n)$ edges are recoloured red in this phase. Note that $\bigwedge_{i \in [t], j \in \{1,2\}} \overline{E_1^{i,j}}$ holds whp by (O2) and N.1. Moreover, using the fact that $S_i \subseteq U$ and that $|U| \le 4n(\ln n)^{-0.45}$ by I.1, we have by Chernoff (Lemma 3.1) that

$$\Pr\left[E_2^{i,j} \wedge \overline{E_1^{i,j}}\right] \le \Pr\left[E_2^{i,j} \mid \overline{E_1^{i,j}}\right] \le \Pr\left[\mathrm{Bin}\left((\ln n)^{1.1} \cdot 4n(\ln n)^{-0.45}, p\right) > 80(\ln n)^{1.65}\right]$$

$$\le \Pr\left[\mathrm{Bin}\left(4n(\ln n)^{0.65}, \frac{10 \ln n}{n}\right) > 80(\ln n)^{1.65}\right] \le e^{-\frac{40}{3}(\ln n)^{1.65}}.$$

Thus, using the union bound and II.3(a), we see that the probability that the number of edges recoloured red during this phase is larger than $(2(\ln n)^{1.1} + 4) \cdot t + 80(\ln n)^{1.65} \cdot 2t$ is at most

$$\Pr\left[\bigvee_{i \in [t], j \in \{1,2\}} E_1^{i,j}\right] + \sum_{i \in [t], j \in \{1,2\}} \Pr\left[E_2^{i,j} \wedge \overline{E_1^{i,j}}\right] \le o(1) + 2t \cdot e^{-\frac{40}{3}(\ln n)^{1.65}} = o(1).$$

This, together with II.3(a), shows that whp III.4 holds. □


We have shown that whp at the end of this phase all of the properties III.1-III.4 hold. We shall assume henceforth that all these properties hold for the sets $R_3$, $W_3$ and $B_3$.

4.5 Phase IV 

In this phase, we want to recolour some edges in $W_3$ in order to find a red cycle $C_3$ containing $\mathrm{SMALL}_2 \cup V(C_2)$ in such a way that $\mathrm{EXP}_3 = V \setminus V(C_3)$ is a "good expander" as described in the outline. The algorithm for this phase is similar in spirit to the one of Phase III. It is divided into three stages. In Stage 1 we define notation and sets that will be useful for us throughout the algorithm. Stage 2 is the main stage of the algorithm and is divided into $s = |\mathrm{SMALL}_2|$ parts. For each $i \in [s]$ we denote by $R^i$ and $W^i$, respectively, the sets of all edges which are coloured red and white at the end of Part $i$. During Part $i$, we recolour "some" edges in $W^{i-1}$ in order to obtain a red cycle $C^i$ (i.e. consisting solely of edges in $R^i$) such that $V(C^i)$ contains $V(C_2)$, $i$ vertices from $\mathrm{SMALL}_2$ and the vertex set of a "small" red path in $\mathrm{EXP}_2$. Finally, in Stage 3 we define sets needed for later phases. The algorithm for this phase is as follows:

Algorithm: Fix an enumeration $y_1, y_2, \ldots, y_s$ of the vertices in $\mathrm{SMALL}_2$, where $s = |\mathrm{SMALL}_2|$, set $C^0 := C_2$, $\mathrm{EXP}^0 := \mathrm{EXP}_2$, $R^0 := R_3$ and $W^0 := W_3$ and for each $j \in J$ let $M_j$ denote the set of inner vertices of the path $P_j$.

Let $(J_i)_{i \in [s]}$ be disjoint subsets of $J$ of size $10^3(\ln n)^{0.45}$ such that $d_{R_3}(y_i, M_j) > 0$ for every $j \in J_i$. Claim 4.12 ensures that such sets exist and we assume that henceforth. For each $j \in J_i$ let $m_j \in M_j$ be such that $y_i m_j \in R_3$. For $i = 1, 2, \ldots, s$ execute the following routine:

Routine: Fix a cyclic orientation of the vertices in the red cycle $C^{i-1}$ and denote for each $v \in V(C^{i-1})$ by $s^+(v)$ and $s^-(v)$ the successor and predecessor of $v$ in the cycle $C^{i-1}$ according to this orientation, respectively.

Set $\mathrm{BAD}_i := \{v \in \mathrm{EXP}^{i-1} : d_{R_3}(v, \mathrm{EXP}^{i-1}) < 2(\ln n)^{0.4}\}$, $\mathrm{GOOD}_i := \mathrm{EXP}^{i-1} \setminus (\mathrm{BAD}_i \cup N_{R_3}(\mathrm{BAD}_i))$ and let $S_i \subseteq \mathrm{GOOD}_i$ be a subset of size at least $\frac{1}{250} n(\ln n)^{-0.45}$ such that $R^{i-1}[S_i]$ is a connected graph of diameter at most $2\ln n$. Claim 4.12 ensures that whp such a set $S_i$ always exists and we assume that henceforth.

Define $A_i := \{s^+(m_j) : j \in J_i\}$ and recolour all the edges in $W^{i-1}$ between the set $A_i$ and $S_i$. Letting $R^i$ and $W^i$ be respectively the sets of red and white edges at this point, let $u_1^i, u_2^i \in S_i$ and $j_1^i, j_2^i \in J_i$ be distinct indices such that $s^+(m_{j_1^i}) u_1^i \in R^i$ and $s^+(m_{j_2^i}) u_2^i \in R^i$. Claim 4.12 ensures that whp such vertices $u_1^i, u_2^i$ and distinct indices $j_1^i, j_2^i$ always exist and we assume that hereafter.

Set

$$Q_1^i := s^+(m_{j_1^i}) \ldots s^-(m_{j_2^i})\, m_{j_2^i}\, y_i\, m_{j_1^i}\, s^-(m_{j_1^i}) \ldots s^+(m_{j_2^i})$$

to be the red path from $s^+(m_{j_1^i})$ to $s^+(m_{j_2^i})$ which contains $y_i$ and is obtained from $C^{i-1}$ by deleting the red edges $m_{j_1^i} s^+(m_{j_1^i})$ and $m_{j_2^i} s^+(m_{j_2^i})$ and adding the red edges $m_{j_1^i} y_i$ and $y_i m_{j_2^i}$. Let $Q_2^i$ be a path inside $S_i$ from $u_1^i$ to $u_2^i$ of length at most $2\ln n$ consisting solely of edges in $R^i$ (such a path exists by the choice of $S_i$). Set $C^i$ to be the red cycle formed by joining the red paths $Q_1^i$ and $Q_2^i$ with the red edges $s^+(m_{j_1^i}) u_1^i$ and $s^+(m_{j_2^i}) u_2^i$. To end the routine, set $\mathrm{EXP}^i := \mathrm{EXP}^{i-1} \setminus V(C^i)$.

(End of Routine)

To finish the algorithm set $\mathrm{EXP}_3 := \mathrm{EXP}^s$ and $C_3 := C^s$.

(End of Algorithm)
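
The splicing step of the routine, which absorbs one vertex of $\mathrm{SMALL}_2$ into the current red cycle, can be illustrated by the following minimal Python sketch. It is not part of the paper's argument: the function name `absorb_vertex` and the list representation of the cycle are assumptions made only for illustration, and the sketch takes for granted that all required red edges (from $y_i$ to $m_{j_1^i}$ and $m_{j_2^i}$, and from the two successors to the ends of $Q_2^i$) are already present.

```python
from typing import Hashable, List, Sequence

Vertex = Hashable


def absorb_vertex(cycle: Sequence[Vertex], m1: Vertex, m2: Vertex,
                  y: Vertex, bridge: Sequence[Vertex]) -> List[Vertex]:
    """Return the cyclic vertex order of the new cycle (hypothetical helper).

    cycle  : vertices of the old cycle, listed along the fixed orientation
    m1, m2 : distinct cycle vertices, both assumed adjacent to y
    y      : the vertex being absorbed
    bridge : vertices of the path Q_2 from u1 to u2, assumed disjoint from
             the cycle, with s+(m1)-u1 and s+(m2)-u2 assumed to be edges
    """
    if m1 == m2:
        raise ValueError("m1 and m2 must be distinct")
    n = len(cycle)
    start = (cycle.index(m1) + 1) % n                    # position of s+(m1)
    rotated = list(cycle[start:]) + list(cycle[:start])  # s+(m1), ..., m1
    k = rotated.index(m2)
    forward = rotated[:k + 1]            # s+(m1), ..., s-(m2), m2
    backward = rotated[k + 1:][::-1]     # m1, s-(m1), ..., s+(m2)
    q1 = forward + [y] + backward        # the path Q_1 through y
    # Q_1 ends at s+(m2); walking Q_2 from u2 back to u1 closes up at s+(m1).
    return q1 + list(bridge)[::-1]


new_cycle = absorb_vertex([0, 1, 2, 3, 4, 5], m1=1, m2=4, y=9, bridge=[7, 8])
print(new_cycle)  # [2, 3, 4, 9, 1, 0, 5, 8, 7]
```

In the toy call, the two deleted cycle edges are 1-2 and 4-5, the vertex 9 is inserted between 4 and 1, and the outside path 7-8 reconnects the two successors 2 and 5.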

We make a few observations about the procedure above which will be useful for later:

(O1) For every $i \in [s]$ we have $|\mathrm{EXP}^{i-1} \setminus \mathrm{EXP}^{i}| \le 2\ln n + 1$ since $\mathrm{EXP}^{i-1} \setminus \mathrm{EXP}^{i} = V(Q_2^i)$ which has size at most $2\ln n + 1$. Moreover, since by III.1(b) we have $d_{R_3}(v, \mathrm{EXP}_2) \ge 2(\ln n)^{0.4}$ for every $v \in \mathrm{EXP}_2$, every vertex in $\mathrm{BAD}_i$ must have at least one red neighbour in $\mathrm{EXP}_2 \setminus \mathrm{EXP}^{i-1}$. Thus, we have $|\mathrm{BAD}_i| \le |\mathrm{EXP}_2 \setminus \mathrm{EXP}^{i-1}| \cdot \Delta(R_3) \le (2\ln n + 1)(i - 1) \cdot \Delta(R_3)$.

(O2) For every $i \in [s+1]$ we have $N_{W^{i-1}}(v, \mathrm{EXP}^{i-1}) = N_{W_3}(v, \mathrm{EXP}^{i-1})$ provided $v \in V(C_2) \setminus \left(\bigcup_{k=1}^{i-1} A_k\right)$. Indeed, this holds since before Part $i$ of the routine we only recolour edges between the sets $A_k$ and $\mathrm{EXP}^{k-1}$ for $k \in [i-1]$.

(O3) For every $i \in [s+1]$ and every $j \in J \setminus \left(\bigcup_{k=1}^{i-1} \{j_1^k, j_2^k\}\right)$ the path $P_j$ is still a path in the red cycle $C^{i-1}$. Indeed, this follows by induction on $i$ using the facts that $m_{j_1^k} \in M_{j_1^k}$, $m_{j_2^k} \in M_{j_2^k}$ and that the sets $J_k$ are disjoint for $k \in [i-1]$.

(O4) The sets $(A_i)_{i \in [s]}$ are disjoint and $N_{W^{i-1}}(v, \mathrm{EXP}^{i-1}) = N_{W_3}(v, \mathrm{EXP}^{i-1})$ for every $v \in A_i$ and every $i \in [s]$. Indeed, (O3) together with the fact that $m_j \in M_j$ for every $j \in J_i$ ensures that $A_i \subseteq \bigcup_{j \in J_i} V(P_j)$ for every $i \in [s]$. Since the sets $(J_i)_{i \in [s]}$ are disjoint, the first observation follows. The second observation follows from the first one together with (O2).


In the next claim we prove that some properties which are assumed in the algorithm hold whp. 


Claim 4.12. All of the following properties hold whp: 

(i) There exist disjoint subsets $(J_i)_{i \in [s]}$ of $J$ of size $10^3(\ln n)^{0.45}$ such that $d_{R_3}(y_i, M_j) > 0$ for every $j \in J_i$.

(ii) For every $i \in [s]$ there exists $S_i \subseteq \mathrm{GOOD}_i$ of size at least $\frac{1}{250} n(\ln n)^{-0.45}$ such that $R^{i-1}[S_i]$ is a connected graph of diameter at most $2\ln n$.

(iii) For every $i \in [s]$, after recolouring the edges in $W^{i-1}$ between the sets $A_i$ and $S_i$, there exist $u_1^i, u_2^i \in S_i$ and distinct indices $j_1^i, j_2^i \in J_i$ such that $s^+(m_{j_1^i}) u_1^i \in R^i$ and $s^+(m_{j_2^i}) u_2^i \in R^i$.

Proof of Claim 4.12. We show first that whp (i) holds. Let $G_{\mathrm{aux}}$ be the bipartite graph with parts $[s]$ and $J$ and edge set $\{ij : i \in [s], j \in J, d_{R_3}(y_i, M_j) > 0\}$. We want to show that whp there are disjoint subsets $(J_i)_{i \in [s]}$ of $J$ of size $10^3(\ln n)^{0.45}$ such that $J_i \subseteq N_{G_{\mathrm{aux}}}(i)$ for every $i \in [s]$. In light of Lemma 3.8 it suffices to show that whp $|N_{G_{\mathrm{aux}}}(I)| \ge 10^3(\ln n)^{0.45}|I|$ for every $I \subseteq [s]$. With this in mind, suppose that $I \subseteq [s]$ is such that $|N_{G_{\mathrm{aux}}}(I)| < 10^3(\ln n)^{0.45}|I|$, and set $X := \{y_i : i \in I\}$ and $Y := \bigcup_{j \in N_{G_{\mathrm{aux}}}(I)} M_j$. Note that we have

$$|X \cup Y| = |X| + 98|N_{G_{\mathrm{aux}}}(I)| < 10^5(\ln n)^{0.45}|X|$$

and, using III.2(b) and the fact that $N_{R_3}(X) \cap M_2 \subseteq Y$, we also have

$$e_{R_3}(X \cup Y) \ge \left((\ln n)^{0.5} - 400\right)|X| > \frac{(\ln n)^{0.5} - 400}{10^5(\ln n)^{0.45}} |X \cup Y| > \frac{(\ln n)^{0.05}}{10^6} |X \cup Y|.$$


By (P5) of Lemma 3.5 we see then that whp $|X \cup Y| \ge \frac{1}{10^6}(\ln n)^{0.05} \cdot \frac{1}{20} n(\ln n)^{-1} = \frac{1}{2 \times 10^7} n(\ln n)^{-0.95}$. But in that case we have

$$|\mathrm{SMALL}_2| \ge |X| > \frac{|X \cup Y|}{10^5(\ln n)^{0.45}} \ge \frac{n}{2 \times 10^{12}(\ln n)^{1.4}}$$

which contradicts III.2(a). We conclude that whp (i) holds.

Next we show that whp (ii) also holds. Indeed, assuming that $\Delta(R_3) \le 40\ln n$ (which holds whp after Phase III, by N.1), we have by (O1), III.1(a) and III.2(a) that

I GOOD* | > | EXP* -1 1 - |BAD,j U N Ra (BAD,)) 

> |EXP 2 | — (2 Inn + 1) • (i — 1) — |BADj| • (1 + 40Inn) > (l-|EXPi|. 


Thus, by II.1(c) there is always a set $S_i \subseteq \mathrm{GOOD}_i$ with the desired properties.

Finally, we show that whp (iii) also holds. By (O4), for every $i \in [s]$ we have $d_{W^{i-1}}(u, S_i) = d_{W_3}(u, S_i)$ for every $u \in A_i$. Moreover, since every vertex $u \in A_i$ is at distance at most 2 in $R_3$ from $y_i \in \mathrm{SMALL}_2$, it follows from III.2(c) and I.4 that $d_{W^{i-1}}(u, S_i) = d_{W_1}(u, S_i) = (1 - o(1))|S_i|$ for all but at most 100 vertices $u \in A_i$. Let $D_i$ be the set of these "bad" vertices. For $j \in \{1,2\}$ and $i \in [s]$ let $B_j^i \subseteq A_i \setminus D_i$ be sets of size at least $400(\ln n)^{0.45}$ such that $B_1^i \cap B_2^i = \emptyset$ (this is possible since $|A_i| = |J_i| = 10^3(\ln n)^{0.45}$ and $|D_i| \le 100$). Thus, for any $i \in [s]$ and $j \in \{1,2\}$ the probability that there is no edge in $R^i$ between $B_j^i$ and $S_i$ is at most:

$$\Pr\left[\mathrm{Bin}\left((1 - o(1))|S_i| \cdot |B_j^i|, p\right) = 0\right] = (1 - p)^{(1 - o(1))|S_i| \cdot |B_j^i|} \le e^{-(1 - o(1))\, p \cdot \frac{1}{250} n(\ln n)^{-0.45} \cdot 400(\ln n)^{0.45}} < \frac{1}{n}.$$

Using III.2(a) and the union bound we see that the probability that for some $i \in [s]$ and $j \in \{1,2\}$ there is no edge in $R^i$ between $B_j^i$ and $S_i$ is at most $2s \cdot \frac{1}{n} = o(1)$. Since the sets $B_1^i$ and $B_2^i$ were chosen to be disjoint, we conclude that whp (iii) holds. This completes the proof of the claim. □
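
Part (i) of Claim 4.12 only asserts that the disjoint sets $J_i$ exist once the Hall-type expansion condition is verified. As an illustration of how such sets could actually be computed, here is a minimal Python sketch based on augmenting paths (Kuhn's bipartite matching algorithm, applied to $r$ copies of each left vertex). This technique is swapped in purely for illustration and is not taken from the paper; the name `disjoint_neighbour_sets` and the dictionary input format are assumptions.

```python
def disjoint_neighbour_sets(neighbours, r):
    """Find disjoint sets J_i inside neighbours[i], each of size r.

    neighbours : dict mapping each left vertex i to a list of right vertices
    r          : required size of every J_i
    Returns a dict i -> set, or None when no such family exists.
    """
    owner = {}  # right vertex j -> slot (i, copy) currently holding it

    def try_assign(slot, seen):
        # Try to give `slot` some right vertex, re-routing earlier slots if needed.
        i = slot[0]
        for j in neighbours[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in owner or try_assign(owner[j], seen):
                owner[j] = slot
                return True
        return False

    for i in neighbours:
        for copy in range(r):
            if not try_assign((i, copy), set()):
                return None  # the Hall-type condition fails for some subset

    result = {i: set() for i in neighbours}
    for j, (i, _) in owner.items():
        result[i].add(j)
    return result


sets = disjoint_neighbour_sets({0: [0, 1, 2], 1: [1, 2, 3]}, r=2)
print(sets)
```

For the toy input the two sets together must use all four right vertices, so the search has to re-route earlier choices; when the neighbourhoods are too small (for instance `{0: [0, 1], 1: [0, 1]}` with `r=2`) the function returns `None`.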


Assuming that the properties of Claim 4.12 hold, denoting by $R_4$, $W_4$ and $B_4$ the sets of red, white and blue edges at the end of this phase's algorithm, we show that whp the following technical conditions hold:


IV.1 Properties of $\mathrm{EXP}_3$:

(a) |EXP 3 | > (1 - t^) \U\ > (2 - o(l)Mlnn) -0 - 45 . 

(b) for every $v \in \mathrm{EXP}_3$ we have $d_{R_4 \setminus R_1}(v, \mathrm{EXP}_3) \ge (\ln n)^{0.4}$.

IV.2 In this phase only $o(n)$ edges of $W_3$ are recoloured red.

The next few claims show that properties IV.1-IV.2 all hold whp, assuming that the properties of Claim 4.12 hold.

Claim 4.13. Properties IV.1(a) and IV.1(b) hold whp.


Proof of Claim 4.13. Note that by (O1), III.1(a) and III.2(a) we have:


1 - 1 S ] \U\ — s ■ (21nn + 1) = (l - \U\ 

(In n) z J \ In n) 

and so this together with I.1 shows that IV.1(a) holds.

We now show that IV.1(b) holds whp. Suppose $d_{R_4 \setminus R_1}(v, \mathrm{EXP}_3) < (\ln n)^{0.4}$ for some $v \in \mathrm{EXP}_3$ and let $i \in [s]$ be the largest possible integer such that $d_{R_3 \setminus R_1}(v, \mathrm{EXP}^{i-1}) \ge 2(\ln n)^{0.4}$ (this is well defined by III.1(b)). Note that by the choice of $i$ we have $v \in \mathrm{BAD}_j$ for every $j > i$. Moreover, since


|EXP 3 | = |EXP 2 | - |EXP* -1 \ EXP*| > 


2—1 
$R_4[U] = R_3[U]$, $\mathrm{EXP}^{j-1} \setminus \mathrm{EXP}^{j} = V(Q_2^j) \subseteq \mathrm{GOOD}_j$ and $\mathrm{GOOD}_j \cap N_{R_3}(\mathrm{BAD}_j) = \emptyset$ for any $j > i$ we have

$$d_{R_4 \setminus R_1}(v, \mathrm{EXP}_3) = d_{R_3 \setminus R_1}(v, \mathrm{EXP}^{i-1}) - \sum_{j=i}^{s} d_{R_3 \setminus R_1}(v, V(Q_2^j)) \ge 2(\ln n)^{0.4} - d_{R_3}(v, V(Q_2^i)).$$

Thus, since we assume that $d_{R_4 \setminus R_1}(v, \mathrm{EXP}_3) < (\ln n)^{0.4}$, we must have that $d_{R_3}(v, V(Q_2^i)) > (\ln n)^{0.4}$. It is easy to see that this implies that there exist two cycles of length $O((\ln n)^{0.6})$ sharing only the vertex $v$ in the graph $R_3$ (since $Q_2^i$ is a path in $R_3$ of length at most $2\ln n + 1$). However, if N.3 holds then this does not happen. Since N.3 holds whp we conclude that IV.1(b) holds whp as desired. □

Claim 4.14. Property IV.2 holds whp.

Proof of Claim 4.14. Recall from the algorithm that in this phase the only edges recoloured are the ones in $W_3$ between the sets $A_i$ and $S_i$ for each $i \in [s]$. Since for each $i \in [s]$ we have $|A_i| = |J_i| = 10^3(\ln n)^{0.45}$ and $|S_i| \le |U| \le 4n(\ln n)^{-0.45}$ by I.1, the total number of edges which are recoloured in this phase is at most $s \cdot 4 \cdot 10^3 n \le 8 \cdot 10^3 n^2 e^{-(\ln n)^{0.4}}$, by III.2(a). Thus, using Chernoff (Lemma 3.1) and the fact that $np \le 10\ln n$ we see that the probability that more than $1.6 \cdot 10^5 n e^{-(\ln n)^{0.4}} \ln n$ edges in $W_3$ are recoloured red in this phase is at most:

$$\Pr\left[\mathrm{Bin}\left(8 \cdot 10^3 n^2 e^{-(\ln n)^{0.4}}, p\right) > 1.6 \cdot 10^5 n e^{-(\ln n)^{0.4}} \ln n\right] \le e^{-\frac{1}{3} \cdot 8 \cdot 10^4 n e^{-(\ln n)^{0.4}} \ln n} = o(1).$$

Since $1.6 \cdot 10^5 n e^{-(\ln n)^{0.4}} \ln n = o(n)$, this concludes the proof of the claim. □


We have shown that whp at the end of this phase all of the properties IV.1-IV.2 hold. We shall assume henceforth that all these properties hold for the sets $R_4$, $W_4$ and $B_4$.


4.6 Phase V 

In this phase we create a Hamilton cycle in the red graph by merging the red cycle $C_3$ with the set $\mathrm{EXP}_3$, by recolouring red $o(n)$ edges. To this end, note first that $R_4[\mathrm{EXP}_3]$ has a bounded number of connected components. Indeed, suppose $C$ is a connected component of $R_4[\mathrm{EXP}_3]$. It follows from properties IV.1(a), IV.1(b), i. of II.1(c) and the fact that $R_4[\mathrm{EXP}_3] = R_2[\mathrm{EXP}_3]$, that the set $C$ has size $|C| \ge \frac{1}{6000} n(\ln n)^{-0.45}$ (since $N_{R_2[S]}(C) \subseteq C$). However, since by I.1, the set $\mathrm{EXP}_3$ has size $|\mathrm{EXP}_3| \le 4n(\ln n)^{-0.45}$, we can conclude that the graph $R_4[\mathrm{EXP}_3]$ has at most 24000 connected components. By recolouring red less than 24000 white edges in $\mathrm{EXP}_3$ we can then whp make the red graph in $\mathrm{EXP}_3$ connected.

Afterwards, we consider two adjacent vertices $v, w$ in the red cycle $C_3$ which have large white degree onto $\mathrm{EXP}_3$. By recolouring edges between $v$, $w$ and $\mathrm{EXP}_3$ we can then whp find $x, y \in \mathrm{EXP}_3$ such that $vx$ and $wy$ are red edges. Finally, by recolouring red at most $|\mathrm{EXP}_3|$ edges inside $\mathrm{EXP}_3$ we can find a red Hamilton path from $x$ to $y$ inside the set $\mathrm{EXP}_3$. This path together with the red path $C_3 \setminus \{vw\}$ and the red edges $vx$ and $wy$ then provides the desired red Hamilton cycle in $V$. The algorithm for this phase is as follows:

Algorithm: Let $C_1, \ldots, C_\ell$ be the connected components of the graph $R_4[\mathrm{EXP}_3]$ where, as indicated above, we have $\ell \le 24000$ and $|C_i| \ge \frac{1}{6000} n(\ln n)^{-0.45}$ for every $i \in [\ell]$. For $1 \le i < \ell$ recolour white edges between $C_i$ and $C_{i+1}$ one by one until exactly one edge is recoloured red for each such $i$. Claim 4.15 ensures that this happens whp and we assume this henceforth. Note that this procedure makes the red graph in $\mathrm{EXP}_3$ connected.

Next, let v,w be adjacent vertices of C 3 such that dw 4 (v, EXP 3 ) > ||EXP 3 | and dw 4 (w, EXP 3 ) > 
||EXP 3 |. Recolour edges in W 4 between {v,w} and EXP 3 until there exist two edges vx and wy 
which are red, where x,y E EXP 3 are distinct vertices. Claim I3~l5l ensures that whp such vertices v, 
w, x and y exist and we assume this henceforth. Now, set e = xy and run the following routine: 

Routine: Consider the graph $H$ which is the current red graph on $\mathrm{EXP}_3$. If $H \cup \{e\}$ does not contain a Hamilton cycle that uses the edge $e$, recolour $e$-boosters of $H$ which are white edges one by one until one of them is recoloured red. Repeat this procedure until the graph $H \cup \{e\}$ considered contains a Hamilton cycle which uses the edge $e$.

(End of Routine)

Claim 4.15 ensures that whp this procedure is successful and we assume this henceforth. A Hamilton cycle in $H \cup \{e\}$ which uses the edge $e$ then provides a red Hamilton path in $\mathrm{EXP}_3$ from $x$ to $y$. This red path together with the red path $C_3 \setminus \{vw\}$ and the red edges $vx$ and $wy$ forms the desired red Hamilton cycle in $V$.

(End of Algorithm)
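
The first step of the algorithm, linking consecutive components by querying white edges one at a time and stopping at the first positive answer, can be sketched in Python as follows. This is only an illustration under stated assumptions: the oracle interface `query(u, v)`, the helper `make_oracle` and the function name `connect_components` are hypothetical and do not appear in the paper.

```python
import random
from itertools import product


def connect_components(components, query):
    """Link consecutive components with exactly one positive answer each.

    components : list of disjoint vertex lists C_1, ..., C_l
    query      : callable (u, v) -> bool, a hypothetical edge oracle
    Returns (red_edges, number_of_queries); raises if some pair of
    consecutive components turns out to have no edge between them.
    """
    red, asked = [], 0
    for left, right in zip(components, components[1:]):
        for u, v in product(left, right):
            asked += 1
            if query(u, v):
                red.append((u, v))
                break  # stop at the first red edge for this pair
        else:
            raise RuntimeError("no edge between two consecutive components")
    return red, asked


def make_oracle(p, seed=0):
    """Lazy G(n, p) oracle: each pair is sampled once, on its first query."""
    rng, answers = random.Random(seed), {}

    def query(u, v):
        key = (min(u, v), max(u, v))
        if key not in answers:
            answers[key] = rng.random() < p
        return answers[key]

    return query


red, asked = connect_components([[0, 1, 2], [3, 4], [5, 6, 7]], make_oracle(1.0))
print(red, asked)  # [(0, 3), (3, 5)] 2
```

With $\ell$ components the sketch produces exactly $\ell - 1$ red edges, matching the count used later when bounding the number of edges recoloured red in this phase.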

In the next claim we prove that some properties which are assumed in the algorithm hold whp.

Claim 4.15. All of the following properties hold whp:

(i) For every $1 \le i < \ell$, if we recolour all the white edges between $C_i$ and $C_{i+1}$ then at least one edge is recoloured red.

(ii) There exist adjacent vertices v, w inC 3 such that dw 4 (v, EXP 3 ) > ^\EXPs\ and dw 4 (w, EXP3) > 
11 EXP3 1. Moreover, after recolouring all the edges in W4 between {u,u;} and EXP3, there exist 
distinct vertices x, y E EXP 3 such that vx and wy are red edges. 

(iii) At any point during the routine, if $H$ is the graph considered and if we recolour all the $e$-boosters of $H$ which are white edges then there will be one which is recoloured red.


Proof. We start by showing that (i) holds whp. Indeed, this follows from Lemma 3.4 since for every $i$ we have $|C_i| \ge \frac{1}{6000} n(\ln n)^{-0.45}$.

Next we show that (ii) holds whp. Recall from I.4 that for every $u \in V(C_1)$ we have $d_{W_1}(u, U) = (1 - o(1))|U|$. Moreover, since $|\mathrm{EXP}_3| = (1 - o(1))|U|$ by IV.1(a) it follows that for every $u \in V(C_1)$ we have $d_{W_1}(u, \mathrm{EXP}_3) = (1 - o(1))|\mathrm{EXP}_3|$. Recall now that in Phase II there were no edges recoloured between $\mathrm{EXP}_3$ and $V(C_1)$ and that in Phases III and IV the number of vertices $u \in V(C_1)$ for which we recoloured edges touching $u$ and vertices of $\mathrm{EXP}_3$ is $o(n)$. Thus, it follows that for all but $o(n)$ vertices $u \in V(C_1)$ we have $d_{W_4}(u, \mathrm{EXP}_3) = (1 - o(1))|\mathrm{EXP}_3|$. Since $|V(C_3) \setminus V(C_1)| = o(n)$ we can
find two vertices v,w E V(C\) which are adjacent in C 3 and for which d\y 4 (v, EXP3) > ||EXPs| and 
d\v 4 ( w i EXP3) > ||EXP3|, as claimed. 

Partition now the set $\mathrm{EXP}_3$ into two sets $A_v$ and $A_w$ of size as equal as possible. If we recolour all the edges in $W_4$ between $\{v, w\}$ and $\mathrm{EXP}_3$ then the probability that afterwards either there is no red edge $vx$ with $x \in A_v$ or $wy$ with $y \in A_w$ is at most
2(l-p)(|-°( 1 ))|EXP3| < e -(i-o(l))|EXP 3 b = o(!)


by IV.1(a). We conclude that property (ii) of the claim holds whp.

Finally we show that (iii) also holds whp. Note first that the routine can be executed at most $|\mathrm{EXP}_3|$ times since each time the size of a longest path in the graph considered increases by at least one. Now, let $H$ be one of the graphs considered during the routine. Note that the edge set $E(H)$ of $H$ is of the form $E(H) = K_1 \cup K_2$ where $K_1 = R_4[\mathrm{EXP}_3]$ and $K_2 \subseteq \binom{\mathrm{EXP}_3}{2}$ is a set of size $|K_2| \le |\mathrm{EXP}_3| + \ell$. Indeed, $K_2 = E(H) \setminus K_1$ consists of the $\ell - 1$ red edges added in this phase connecting the connected components of $K_1$ and at most $|\mathrm{EXP}_3|$ red edges added during the routine. Moreover, note that the graph $H$ is connected since the red graph on $\mathrm{EXP}_3$ is already connected when the routine is executed. Since $R_4[\mathrm{EXP}_3] = R_2[\mathrm{EXP}_3]$ and since the graph $(R_4 \setminus R_1)[\mathrm{EXP}_3]$ has minimum degree at least $(\ln n)^{0.4}$ by IV.1(b), it follows from i. of II.1(c) that for any set $X \subseteq \mathrm{EXP}_3$ of size $|X| \le \frac{1}{6000} n(\ln n)^{-0.45}$ we have $|N_H(X) \cup X| \ge 5|X|$. Thus, again using the fact that $R_4[\mathrm{EXP}_3] = R_2[\mathrm{EXP}_3]$ and that $\ell \le 24000$ we conclude from II.1(d) that the number of $e$-boosters for $H$ in the set $W_2[\mathrm{EXP}_3]$ is at least $10^{-8} n^2(\ln n)^{-0.9}$.

Recall now that in Phases III and IV no white edges inside $\mathrm{EXP}_3$ were recoloured and so $W_4[\mathrm{EXP}_3] = W_2[\mathrm{EXP}_3]$. Moreover, as indicated above, in Phase V we recolour red less than $\ell + |\mathrm{EXP}_3|$ white edges inside $\mathrm{EXP}_3$. The probability that at least $10^{-9} n^2(\ln n)^{-0.9}$ white edges are recoloured blue during Phase V is then at most the probability that at most $\ell + |\mathrm{EXP}_3| - 1 \le 5n(\ln n)^{-0.45}$ white edges are recoloured red before $10^{-9} n^2(\ln n)^{-0.9}$ white edges are recoloured blue, which is at most


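$$\sum_{j=0}^{5n(\ln n)^{-0.45}} \binom{10^{-9} n^2(\ln n)^{-0.9} + j}{j} (1 - p)^{10^{-9} n^2(\ln n)^{-0.9}} p^{j} \le O\left(n(\ln n)^{-0.45}\right) \cdot \binom{O(n^2(\ln n)^{-0.9})}{5n(\ln n)^{-0.45}} p^{5n(\ln n)^{-0.45}} \cdot e^{-p \cdot \Omega(n^2(\ln n)^{-0.9})}$$

$$\le O\left(n(\ln n)^{-0.45}\right) \cdot \left(O((\ln n)^{0.55})\right)^{5n(\ln n)^{-0.45}} \cdot e^{-\Omega(n(\ln n)^{0.1})} = o(1).$$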


Thus, whp less than $10^{-9} n^2(\ln n)^{-0.9}$ white edges are recoloured blue in this phase. Thus, since the set $W_4[\mathrm{EXP}_3]$ contains at least $10^{-8} n^2(\ln n)^{-0.9}$ $e$-boosters for $H$ we conclude that whp at any point in the routine there are at least $9 \cdot 10^{-9} n^2(\ln n)^{-0.9}$ $e$-boosters for $H$ which are white edges (at that point). The probability that none of these $e$-boosters is recoloured red is then at most

$$\Pr\left[\mathrm{Bin}\left(9 \cdot 10^{-9} n^2(\ln n)^{-0.9}, p\right) = 0\right] = (1 - p)^{9 \cdot 10^{-9} n^2(\ln n)^{-0.9}} \le e^{-p \cdot 9 \cdot 10^{-9} n^2(\ln n)^{-0.9}} \le e^{-9 \cdot 10^{-9} n(\ln n)^{0.1}}.$$

Thus, since the routine is executed at most $\ell + |\mathrm{EXP}_3| \le 5n(\ln n)^{-0.45}$ many times, we conclude by the union bound that the probability that at some point in the routine all the $e$-boosters of $H$ which are white edges (where $H$ is the graph being considered at that point) are recoloured blue is at most

$$5n(\ln n)^{-0.45} \cdot e^{-9 \cdot 10^{-9} n(\ln n)^{0.1}} = o(1),$$

and so property (iii) holds whp as claimed. □

Assuming that the properties of Claim 4.15 hold and denoting by $R_5$ the set of red edges at the end of this phase's algorithm, it is clear from it that the graph $R_5$ contains a Hamilton cycle.

We claim now that $|R_5 \setminus R_4| < \ell + 3|\mathrm{EXP}_3| = o(n)$. Indeed, in the beginning of this phase's algorithm exactly $\ell - 1$ edges are recoloured red in order to make the red graph in $\mathrm{EXP}_3$ connected. Later, we recoloured edges between $\{v, w\}$ and $\mathrm{EXP}_3$ and so at that point at most $2|\mathrm{EXP}_3|$ edges are recoloured red. Finally, we recoloured one edge red each time the routine was executed. However, as indicated in the proof of Claim 4.15 the routine can be executed at most $|\mathrm{EXP}_3|$ times. Thus, we conclude that less than $\ell + 3|\mathrm{EXP}_3| = o(n)$ edges are recoloured red in this phase, as claimed.

To finish the proof of Theorem 1 we show that $|R_5| = n + o(n)$. Indeed by I.2, II.4, III.4, IV.2 and by the previous paragraph we conclude that

$$|R_5| = |R_1| + |R_5 \setminus R_1| = n + o(n)$$

as desired.

5 Concluding remarks 

In this paper we introduced a new type of problem in random graphs, where the goal is to expose a subgraph which possesses some target property $\mathcal{P}$, by asking as few queries as possible. Note that this problem is general and can be considered in any model of random structures.

Although we chose to focus on the property of Hamiltonicity, our proof method can be applied to prove analogous statements regarding other interesting properties. For example, one can show that for $p \ge \frac{\ln n + \omega(1)}{n}$, there exists an adaptive algorithm, interacting with the probability space $G(n,p)$, which whp finds a matching of size $\lfloor n/2 \rfloor$ (a perfect matching) after getting $n/2 + o(n)$ positive answers.

Let us now show that one cannot get rid of the $o(n)$ term. More precisely, we show that whp at least $n + \Omega(1/\sqrt{p})$ positive answers are needed in order to find a Hamilton cycle. In particular, if $p = \Theta\left(\frac{\log n}{n}\right)$ then this means that at least $\Omega\left(\sqrt{n/\log n}\right)$ extra positive answers are needed to find a Hamilton cycle. For this aim, let $k := k(n)$ be an integer and let $G$ be a non-Hamiltonian graph on $n$ vertices with exactly $n + k$ edges. Suppose that there exists a non-edge $xy$ of $G$ for which $G + xy$ is Hamiltonian, and observe that $G$ contains at most two vertices of degree 1 and no isolated vertices. First, let us note that both $x$ and $y$ can have at most one neighbor in $G$ which is of degree 2. Indeed, every neighbor of (say) $x$ which has degree exactly 2 must be connected to $x$ on the Hamilton cycle created by adding $xy$ to $G$. Since $xy$ must be an edge of this cycle, $x$ cannot have more than one such neighbor. Second, we estimate the number of pairs $xy$ for which both $x$ and $y$ have at most one neighbor of degree 2 in $G$. To this end, let $A$ denote the set of all vertices in $G$ of degree different from 2, and let $a := |A|$. Since $G$ has exactly $n + k$ edges, and since there are at most 2 vertices of degree 1, it follows that $2 + 2(n - a) + 3(a - 2) \le 2n + 2k$. Therefore, we have $a \le 2k + 4$. Next, since $2(n - a) + \sum_{v \in A} d_G(v) \le 2n + 2k$, using the estimate we obtained on $a$ we conclude that

$$|N(A)| \le \sum_{v \in A} d_G(v) = O(k).$$

Thus, all in all, we have $|A \cup N(A)| = O(k)$. Observe now crucially that if $xy$ is a non-edge of $G$ for which $G + xy$ is Hamiltonian then $x$ and $y$ must be in $A \cup N(A)$. Indeed, if (say) $x \notin A$ then $x$ must have degree 2 and so at least one of its neighbours lies in $A$ (as discussed above) and so $x \in N(A)$. Hence, there are $O(k^2)$ pairs $xy$ for which $G + xy$ might be Hamiltonian. Suppose now that we have an adaptive algorithm, interacting with the probability space $G(n,p)$, which whp finds a Hamilton cycle after getting at most $n + k + 1$ positive answers. Let $G$ be the random graph obtained by the algorithm whose edges correspond to the positive answers until the step just before a Hamilton cycle is found. Note that by hypothesis whp $G$ has at most $n + k$ edges and so by the reasoning above, it follows that there are at most $O(k^2)$ possible non-edges of $G$ which can be queried to turn $G$ into a Hamiltonian graph. However, if $k = o(1/\sqrt{p})$, since $k^2 p = o(1)$, by conditioning on what the graph $G$ can be, we see by Markov's inequality that whp none of these pairs of vertices obtain positive answers even if we query all of them. Thus, we conclude that no such algorithm exists.
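
The structural observation behind this lower bound (every non-edge $xy$ whose addition makes $G$ Hamiltonian has both endpoints in $A \cup N(A)$) can be checked by brute force on very small graphs. The following Python sketch is an illustration only: the helper names are hypothetical, vertices are assumed to be labelled $0, \ldots, n-1$, and the exhaustive Hamiltonicity test is feasible only for tiny $n$.

```python
from itertools import combinations, permutations


def is_hamiltonian(n, edges):
    """Brute-force Hamiltonicity test; only sensible for very small n."""
    adj = {frozenset(e) for e in edges}
    for perm in permutations(range(1, n)):
        order = (0,) + perm
        if all(frozenset((order[i], order[(i + 1) % n])) in adj for i in range(n)):
            return True
    return False


def closing_pairs(n, edges):
    """Non-edges xy of G such that G + xy is Hamiltonian."""
    present = {frozenset(e) for e in edges}
    return [(x, y) for x, y in combinations(range(n), 2)
            if frozenset((x, y)) not in present
            and is_hamiltonian(n, list(edges) + [(x, y)])]


def candidate_set(n, edges):
    """A together with N(A), where A = vertices of degree different from 2."""
    nbrs = {v: set() for v in range(n)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    a = {v for v in range(n) if len(nbrs[v]) != 2}
    return a.union(*(nbrs[v] for v in a))


path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]  # a path on 6 vertices
pairs = closing_pairs(6, path)
print(pairs)  # [(0, 5)]
assert all(x in candidate_set(6, path) and y in candidate_set(6, path) for x, y in pairs)
```

For the path, the only closing pair joins the two vertices of degree 1, and both lie in $A$, as the argument above predicts.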

Note that even though the general setting we introduced appears to be new, there has been some previous work of this flavor in the literature, albeit inexplicit. For example, the DFS-based argument of [33] indicates that in the super-critical regime $p = \frac{1 + \varepsilon}{n}$, $\Theta(\varepsilon) n$ positive answers suffice to uncover typically a connected component of size proportional to $\varepsilon n$, and this is clearly optimal. The analysis from [15] also gives an adaptive algorithm for finding a path of length $\Theta(\varepsilon^2) n$ (which is whp the asymptotic order of magnitude of a longest path in such a random graph) after querying $O(\varepsilon) n$ edges successfully. What matters here is the dependence on $\varepsilon$, and the above stated algorithmic bound is above the trivial lower bound by the $1/\varepsilon$ factor. In a companion paper [9] we show that this gap cannot be bridged and the algorithm from [15] is essentially (up to a $\log(1/\varepsilon)$ factor) best possible.

Another natural instance of the setting promoted in this paper is when the target property $\mathcal{P}$ is the containment of a fixed graph $H$. In this case, the obvious lower bound for the total number of queries needed is of order $1/p$. It appears that the form of the optimal bound for the number of queries may depend heavily on the value of $p$. Consider for example the case $H = K_3$. For constant $p$, one can just query the pairs in $\omega(1)$ pairwise disjoint triples of vertices to find whp a copy of the triangle. However, say, for the case $p = n^{-1/2}$ the right bound seems to be around $n^{3/4}$. Indeed, a simple algorithm asking a bit more than $n^{3/4}$ queries would be first to query pairs containing a fixed vertex $v$ till $\omega(n^{1/4})$ edges touching $v$ are found; this would take $\omega(n^{3/4})$ queries. Querying now all pairs between the other endpoints of these edges uncovers whp an edge $(u, w)$ closing a triangle with $v$. For the lower bound, one can argue that having $o(n^{1/4})$ positive answers on the board produces only $o(n^{1/2})$ pairs of vertices at distance two, and whp none of these pairs will show up in the random graph to close a desired triangle. This argument has certain similarities to the lower bound
for avoidance of a given graph in Achlioptas processes given in HU ■ Of course the case of triangles 
appears to be relatively easy, and we expect much more involved analysis for a general H. 
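
The two-stage triangle search described above can be sketched as follows. This is a hedged illustration rather than an implementation from the paper: the function name `find_triangle`, the parameter `star_size` (standing in for the $\omega(n^{1/4})$ threshold) and the oracle interface `query(u, w)` are assumptions, and the toy oracle below simply answers from a fixed small edge set instead of sampling $G(n,p)$.

```python
def find_triangle(n, query, star_size, v=0):
    """Two-stage adaptive search for a triangle through the vertex v.

    query     : callable (u, w) -> bool, a hypothetical edge oracle
    star_size : number of neighbours of v to collect before stage two
    Returns (triangle or None, number_of_queries).
    """
    asked, star = 0, []
    # Stage 1: query pairs containing v until star_size edges at v are found.
    for u in range(n):
        if u == v:
            continue
        if len(star) >= star_size:
            break
        asked += 1
        if query(v, u):
            star.append(u)
    # Stage 2: query all pairs among the neighbours of v found in stage 1.
    for i, u in enumerate(star):
        for w in star[i + 1:]:
            asked += 1
            if query(u, w):
                return (v, u, w), asked
    return None, asked


# Toy oracle answering from a fixed edge set (an assumption for the demo).
edges = {frozenset(e) for e in [(0, 2), (0, 3), (0, 5), (3, 5), (1, 4)]}
triangle, asked = find_triangle(6, lambda a, b: frozenset((a, b)) in edges, star_size=3)
print(triangle, asked)  # (0, 3, 5) 8
```

In the regime $p = n^{-1/2}$ discussed above, stage one would use roughly `star_size / p` queries and stage two at most `star_size ** 2 / 2`, which is how the $n^{3/4}$ order of magnitude arises when `star_size` is slightly larger than $n^{1/4}$.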